Massachusetts Institute of Technology
ABSTRACT 


This paper offers a method for combining the results of diverse
experiments when there is uncertainty about the relevance of some
experiments to others. Within a Bayesian framework motivated by
Lindley and Smith (1972), the method is used to assess human cancer
risks from heterogeneous toxicological and epidemiological data. A
distinction is drawn between the sampling error of each experiment
and an error of relevance among experiments. The latter error
reflects the uncertainty of interspecies extrapolations. It is shown
how the experimental data, along with prior information on the
credibility of such extrapolations, permit estimation of the human
carcinogenic effects of various environmental emissions. A
cross-validation method is proposed for selecting the most relevant
subset among an array of experiments by eliminating those species or
environmental agents that contribute most to the extrapolative
error. Finally, other types of prior information on the
relationships between experiments are incorporated into the analysis.
1. INTRODUCTION
This  paper  offers  a  statistical  method  for  combining  the  results  of 
diverse  experiments  when  there  is  uncertainty  about  the  relevance  of  some 
experimental  results  to  others.  Our  analysis  is  motivated  by  the  increasing 
prominence  of  public  policy  problems  in  which  decision  makers  call  on  multiple 
disciplines for advice. We seek an "enlargement of statistical techniques to
encompass research programs rather than one at a time studies..."
(Schneiderman, 1966).

We apply our method to the specific problem of assessing human cancer
risk from an environmental agent when epidemiological data are imprecise or
absent, but when precise toxicological studies in various species are
available. This problem is more complicated than that posed by Cochran
(1980), in which experiments of very similar design differed primarily in
their date and location. In our problem, some experiments may be performed in
vivo, while others are conducted in cell culture or in subcellular systems.
Frequently, the experiments involve different compounds or mixtures of
compounds. Interspecies comparisons are invariably required. Ideally, the
exact relations among these experiments should be determined from fundamental
advances in understanding the etiology of cancer. Our more modest goal here
is to provide a statistical framework that permits scientists to combine the
experimental results with their own prior judgments to reach quantitative
conclusions. This objective is similar in spirit to those of Freireich et al.
(1966), who compared the toxicity of anti-cancer agents in mouse, rat,
hamster, dog, monkey, and man; Meselson and Russell (1977), who compared the
mutagenic and carcinogenic potency of 14 compounds; and Crouch and Wilson
(1979, 1980), who examined the relative potencies of several chemical
carcinogens in various pairs of species, most extensively in rats and mice.

The main idea behind our approach is to characterize precisely the
different sources of variation among experiments. In our method, the results
of each experiment are summarized by a single number, such as the slope of a
dose-response relation. Each slope has an approximate standard error. These
errors of measurement are assumed to be independent. The actual slopes, we
hypothesize, lie near the response surface of an underlying regression model.
Since some environmental agents may have distinctive effects in some species,
this regression model necessarily entails some error. The critical factor
linking the experiments is the scientist's a priori information on the
exchangeability of these errors of interspecies extrapolation.

These ideas are formalized within a Bayesian framework similar to that of
Lindley and Smith (1972). We assign prior distributions for the
"hyperparameters" of the underlying regression model. Given these prior
distributions and the experimental data, we compute the posterior
distributions of the dose-response slopes. Empirical Bayes versions of our
procedures are also presented.

In the next section, we pose a problem in the assessment of human lung
cancer risk from a number of environmental emissions that contain polyaromatic
hydrocarbons. Section 3 formally develops our approach. Sections 4 and 5
then apply our method to the data. In Section 6, we offer a simple
cross-validation  procedure  to  assist  in  deciding  which  experiments  are  worth 
including.  In  Section  7,  we  discuss  the  case  where  a  scientist  has  prior 
information  on  the  relationships  between  experiments.  The  final  section 
critically  reviews  our  approach  and  suggests  further  lines  of  investigation. 

Our  calculations  in  this  paper  are  illustrative.  We  do  not  propose  here 
to  draw  conclusions  about  the  public  health  significance  of  various  ambient 
concentrations  of  pollutants.  This  would  require  a  more  thorough  discussion 
than space permits. Nor do we attach special significance to the
dose-response  models  of  carcinogenesis  from  which  our  data  were  derived. 
Harris  (1981)  has  discussed  the  limitations  of  the  use  of  such  dose-response 
estimates  in  predicting  excess  cancer  incidence  from  ambient  population 
exposures. 

2.  THE  PROBLEM 

Table 1 displays the results of two types of carcinogenesis studies of
two related environmental emissions, arranged in a 2x2 table. For each of the
four experiments, three numbers are given: the observed slope of the
dose-response relation; its coefficient of variation (i.e., the ratio of the
standard error of the observed slope to its mean); and the natural logarithm
of the observed slope. The first row of experiments represents the results of
epidemiological studies of occupational exposures to coke oven emissions
(Lloyd, 1971; Mazumdar et al., 1975; U.S. Environmental Protection Agency,
1979) and to roofing tar emissions (Hammond et al., 1976). The second row
represents the results of skin tumor initiation experiments on the
dichloromethane extracts of these emissions. The latter experiments were
performed under identical conditions in the same laboratory, as part of the
U.S. Environmental Protection Agency diesel emission research program (Nesnow
et al., 1979; Huisingh et al., 1979). The slopes and their standard errors
were estimated by maximum likelihood methods, as described in Harris (1981).

Our  goal  is  to  improve  the  precision  of  the  estimated  dose-response 
slopes  for  the  human  studies.  The  main  question  is  how  to  use  all  of  the  data 
in  Table  1  to  achieve  this  objective. 

One  difficulty  is  immediately  apparent.  The  dose-response  slopes  in  man 
and  mice  are  measured  in  different  units.  We  might  attempt  to  convert  all  the 
TABLE 1.
2x2 Experimental Data Matrix

                                         Roofing Tar    Coke Oven
                                         Emissions      Emissions

Lung Cancer (Man)*        (slope)           1.64           4.40
                          (coef. var.)      1.41           0.34
                          (log slope)       0.49           1.48

Skin Tumor Initiation     (slope)           0.54           2.10
(Sencar Mice)**           (coef. var.)      0.04           0.04
                          (log slope)      -0.63           0.74

*Increment in relative risk per 10 µg/m³ extractable organics x years.
**Papillomas/mouse per mg extract at 27 weeks.

experiments into common units, e.g., the incremental lifetime incidence of
tumor per mg/kg body weight per day, or the age-specific probability of tumor
per cumulative lifetime dose per unit body surface area. The choice of
conversion factor, however, is hardly clear.

One  way  to  circumvent  this  problem  is  to  consider  the  relative  potencies 
of  the  two  environmental  emissions  in  each  species.  Since  the  dose-response 
slopes  in  each  row  in  Table  1  are  measured  in  the  same  units,  the  ratios  of 
the  slopes  are  comparable  unitless  quantities.  In  fact,  a  natural  hypothesis 
for combining these data is that the relative potency of the two emissions is
preserved  across  the  two  biological  systems. 

The  extent  to  which  these  data  adhere  to  such  an  hypothesis  can  be 
ascertained  in  Figure  1,  which  depicts  the  means  and  standard  errors  of  the 
dose-response  slopes  on  a  logarithmic  scale.  (The  error  bars  correspond  to 
the  standard  errors  of  the  log  slopes,  which  have  been  approximated  by  the 
coefficients  of  variation  in  Table  1.)  On  a  log  scale,  the  difference  between 
coke  oven  slope  and  roofing  tar  slope  in  man  is  close  to  the  corresponding 
difference  in  mice.  To  show  this,  we  have  also  drawn  the  (weighted)  least 
squares  parallel  lines  on  Figure  1,  and  the  fit  is  good. 
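
The parallel-lines fit described above can be sketched numerically. The
following is a minimal illustration, assuming the log slopes and coefficients
of variation as transcribed in Table 1 and an indicator coding of the design
(the variable names and the coding are ours, not part of the original
analysis). With four experiments and three fitted parameters, the weighted
residual sum of squares reduces to the squared departure from parallelism
divided by the sum of the four sampling variances.

```python
import numpy as np

# Log slopes and coefficients of variation as read from Table 1, ordered
# (man, roofing tar), (man, coke oven), (mouse, roofing tar), (mouse, coke oven).
y = np.array([0.49, 1.48, -0.63, 0.74])
c = np.array([1.41, 0.34, 0.04, 0.04])

# Columns: intercept, mouse indicator, coke-oven indicator (parallel lines).
X = np.array([[1, 0, 0],
              [1, 0, 1],
              [1, 1, 0],
              [1, 1, 1]], dtype=float)
W = np.diag(1.0 / c**2)  # weights = inverse sampling variances

# Weighted least squares fit of the additive (parallel-lines) model.
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
fitted = X @ beta
rss = float((y - fitted) @ W @ (y - fitted))

# Departure from parallelism: difference of the two log relative potencies.
contrast = (y[1] - y[0]) - (y[3] - y[2])
contrast_se = float(np.sqrt(np.sum(c**2)))

print("fitted log slopes:", np.round(fitted, 3))
print("weighted RSS:", round(rss, 4))
print("contrast / s.e.:", round(contrast / contrast_se, 3))
```

A small weighted residual relative to its single degree of freedom is what
"the fit is good" amounts to in this 2x2 layout.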

This result could be purely fortuitous. The standard errors of the
epidemiological data, especially for roofing tar, are relatively large. But
there is a deeper objection. The hypothesis that the relative carcinogenic
potency of these two emissions is preserved across species ignores possible
interspecies or interorgan differences in the distribution of particulates,
the extractability of particulate-bound polyaromatic hydrocarbons, their
clearance, metabolism, and genetic and other repair mechanisms. To claim that
the totality of data in Table 1 provides more information about the human lung
cancer risks from, say, roofing tar exposure than the roofing tar
epidemiological study alone is to maintain some degree of prior belief that
these interspecies differences are not too large. The uncertainty inherent in
such interspecies extrapolations clearly differs from the conventional
sampling error of each experiment. If we are to use all of the data in Table
1 to estimate any one slope, then we must devise some measure of the extent of
this extrapolative uncertainty.

Finally, there is the objection that the hypothesis of preserved relative
potencies will not withstand other empirical comparisons. It is possible that
such an hypothesis applies accurately only to the comparisons in Table 1, but
not to other bioassays or to other environmental emissions. However, if we
had no prior belief that the hypothesis should hold any more exactly for
roofing tar and coke oven emissions than, say, for automotive particulate
emissions or cigarette smoke, then any empirical comparisons that contradict
the hypothesis would raise our uncertainty in the current extrapolation.

3. STATISTICAL MODEL
3.1 Notation and Assumptions.
Let $y_{k\ell}$ be the logarithm of the estimated dose-response slope for the
experiment in species $k$ on environmental agent $\ell$. In the problem above,
$k=1,2$ correspond to epidemiological studies in man and skin tumor initiation
experiments in mice, respectively, while $\ell=1,2$ correspond to roofing tar
emissions and coke oven emissions, respectively. The variables $y_{k\ell}$ are
presumed to be approximately normally distributed with mean $\theta_{k\ell}$
and known standard error $c_{k\ell}$. The assumptions of normality and known
standard errors are not unreasonable, since each $y_{k\ell}$ was a maximum
likelihood estimate based on a relatively large experiment. The quantities
$\theta_{k\ell}$ are the true log dose-response slopes, the primary parameters
of interest.

We assume that each $\theta_{k\ell}$ has a symmetric prior distribution with
mean value of the form

(3.1)     $E[\theta_{k\ell} \mid \mu, \alpha_k, \gamma_\ell] = \mu + \alpha_k + \gamma_\ell$,

where the hyperparameters $\{\mu, \alpha_k, \gamma_\ell\}$ represent the
overall mean effect, species-specific effects, and emission-specific effects,
respectively. In our Bayesian framework, these hyperparameters in turn have
prior distributions. Equation (3.1) embodies the hypothesis that the relative
potency of the two emissions is on average preserved across species. Moreover,
the relative potencies are a priori just as likely to be larger for one
species than for the other. The various $\theta_{k\ell}$ are measured in
different units, since they are the logs of dose-response slopes for quite
different dose-response experiments. The additive model (3.1) is meaningful,
however, so long as $(\theta_{11} - \theta_{12}) - (\theta_{21} - \theta_{22})$
is a dimensionless quantity, a condition satisfied in our problem above. In
that case, the units of measurement for $\mu$, $\alpha_k$, and $\gamma_\ell$
can be chosen so that the quantities

          $\delta_{k\ell} = \theta_{k\ell} - \mu - \alpha_k - \gamma_\ell$

are similarly dimensionless. Each $\delta_{k\ell}$ is a species-emission
interaction effect, measuring the amount (on a log scale) by which the
experiment in species $k$ on emission $\ell$ deviates from the constant
relative potency hypothesis.

Conditional on the value of another hyperparameter $\sigma$, we further assume
that the $\delta_{k\ell}$ are independently distributed a priori as
$N(0, \sigma^2)$. Under this critical assumption, the interaction effects
$\delta_{k\ell}$ are a priori exchangeable
(de Finetti, 1964). That is, we have no prior information that a deviation of
a given magnitude from the constant relative potency model is more likely in
one experiment than in any other.

We  take  care  here  to  elucidate  the  precise  meaning  of  this  form  of  prior 
information.  We  recognize  that  the  mechanisms  of  carcinogenesis  may  vary 
considerably  among  agents  or  species.  Quite  different  metabolic  pathways  may 
be  involved.  The  number  of  stages  in  expression  of  tumor  may  differ.  Any 
variation  that  is  distinctive  to  a  particular  agent  in  a  particular  species 
could result in a marked deviation from our additive model. The
exchangeability hypothesis does not exclude the possibility of such
deviations. It merely states that we cannot identify a priori which entry in
our  two-way  table  of  experiments  is  likely  to  have  the  largest  deviation. 

The hyperparameter $\sigma$ measures our belief in the degree of accuracy of
the equal relative potency model. A value of $\sigma = 0.05$, for example,
implies that within one normal standard deviation, i.e., with probability
0.68, the additive model is accurate to within an absolute error of 0.05; or
equivalently, each dose-response slope conforms to the underlying equal
relative potency model to a multiplicative factor of
$\exp(0.05) \approx 1.05$. A prior belief that $\sigma$ is of this magnitude
thus implies a relatively high degree of confidence in the underlying model.
On the other hand, a value of $\sigma = 5$ implies that with probability 0.68,
each dose-response slope conforms to the underlying model to a multiplicative
factor of $\exp(5) \approx 150$. A belief that $\sigma$ is of this magnitude
implies much less faith that the experiments can be profitably combined.
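
The arithmetic behind this interpretation is short enough to check directly.
The sketch below is only illustrative (the function name is ours): it converts
a value of the interaction standard deviation on the log scale into the
corresponding multiplicative factor, and recovers the one-standard-deviation
normal probability quoted in the text.

```python
import math

def multiplicative_band(sigma):
    """Factor exp(sigma) within which a slope agrees with the model at one s.d."""
    return math.exp(sigma)

# Probability that a normal deviate lies within one standard deviation.
one_sd_probability = math.erf(1.0 / math.sqrt(2.0))

for sigma in (0.05, 5.0):
    print(sigma, round(multiplicative_band(sigma), 2),
          round(one_sd_probability, 2))
```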

We now generalize beyond the 2x2 case considered above, letting
$\{ y_{k\ell} \pm c_{k\ell} ;\ k=1,\ldots,K ;\ \ell=1,\ldots,L \}$ be a set of
experimental observations on the log dose-response slopes for $K$ species and
$L$ environmental agents. We also admit the possibility that some $y_{k\ell}$
are missing from a set of contemplated experiments. Except where otherwise
noted below, we assume that the set of available experiments is connected, in
the sense that any available experiment $(k,\ell)$ can be reached from any
other available experiment $(k',\ell')$ by a series of moves from one
available experiment to another in which each move is along a single row or
column. Conditional on $\sigma$, the observed $y_{k\ell}$ are generated by the
linear model

(3.2)     $y_{k\ell} = \mu + \alpha_k + \gamma_\ell + \delta_{k\ell} + \varepsilon_{k\ell}$,

where the three sets of variables $\{\mu, \alpha_k, \gamma_\ell\}$,
$\{\delta_{k\ell}\}$, and $\{\varepsilon_{k\ell}\}$ are independent a priori,
with the $\delta_{k\ell}$ i.i.d. $N(0, \sigma^2)$ and the
$\varepsilon_{k\ell}$ independent $N(0, c_{k\ell}^2)$. Following the usual
general linear model formulation, we further replace the expressions
$\mu + \alpha_k + \gamma_\ell$ in (3.2) by $X\beta$, where $\beta$ is a column
vector of hyperparameters and $X$ is an appropriately chosen design matrix.
Of the $K+L+1$ hyperparameters in $\{\mu, \alpha_k, \gamma_\ell\}$, at most
$K+L-1$ are independently estimable in the classical sense. So long as we use
an informative full rank prior distribution on all $K+L+1$ hyperparameters, no
restrictions on these hyperparameters are necessary in our Bayesian framework.
In other cases, however, particularly when a diffuse prior distribution is
employed, we shall assume that $\beta$ corresponds only to the independently
estimable components of $\{\mu, \alpha_k, \gamma_\ell\}$ and that $X$ is the
corresponding full rank design matrix.
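
The count of estimable hyperparameters and the role of connectedness can be
illustrated with a small sketch. The helper `design_matrix` and the
zero-based (species, agent) index pairs are assumptions made for this
illustration only; the rank of the overparametrized design matrix gives the
number of independently estimable combinations.

```python
import numpy as np

def design_matrix(pairs, K, L):
    """Rows [1, species indicators, agent indicators] for each available (k, l)."""
    X = np.zeros((len(pairs), 1 + K + L))
    for i, (k, l) in enumerate(pairs):
        X[i, 0] = 1.0          # overall mean
        X[i, 1 + k] = 1.0      # species-specific effect
        X[i, 1 + K + l] = 1.0  # agent-specific effect
    return X

K, L = 2, 2
complete = [(0, 0), (0, 1), (1, 0), (1, 1)]  # the full 2x2 layout
disconnected = [(0, 0), (1, 1)]              # no shared row or column

rank_complete = np.linalg.matrix_rank(design_matrix(complete, K, L))
rank_disconnected = np.linalg.matrix_rank(design_matrix(disconnected, K, L))

print("complete layout rank:", rank_complete, "vs K+L-1 =", K + L - 1)
print("disconnected layout rank:", rank_disconnected)
```

In the disconnected layout the rank falls below $K+L-1$, so the data alone
carry no information linking the two experiments.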

Finally, we assume that $\beta$ is a priori multivariate normal with mean
vector $b$ and covariance matrix $V$, and that $\sigma$ has a prior
distribution $\Pi$ with density $\pi(\sigma)$. Now let $i = 1,\ldots,n$ index
the experiments, replacing the paired indices $(k,\ell)$. Let $m$ be the rank
of $X$, that is, the number of independently estimable elements of
$\{\mu, \alpha_k, \gamma_\ell\}$. Let $Y$, $\theta$, $\delta$, and
$\varepsilon$ be $n \times 1$ column vectors replacing $\{y_{k\ell}\}$,
$\{\theta_{k\ell}\}$, $\{\delta_{k\ell}\}$, and $\{\varepsilon_{k\ell}\}$,
respectively. Let $I$ be the $n \times n$ identity matrix, and let $C$ be
$\mathrm{diag}(c_1^2, \ldots, c_n^2)$. Our model can be formulated generally as

          $Y = X\beta + \delta + \varepsilon$, where $\theta = X\beta + \delta$, and

(3.3a)    $\sigma \sim \Pi$,

(3.3b)    $\beta \sim N(b, V)$,

(3.3c)    $(\theta \mid \beta, \sigma) \sim N(X\beta, \sigma^2 I)$,

(3.3d)    $(Y \mid \theta) \sim N(\theta, C)$.

This model possesses a hierarchical structure similar to that formulated
by Lindley and Smith (1972). The experimental data $Y$ and $C$, as well as
$b$, $V$, and the distribution $\Pi$, are assumed to be known. The choice of a
prior distribution $\Pi$ is left unspecified until Section 4. We note here
that there is no advantage or compelling reason to choose the inverse
chi-squared prior distribution for $\sigma^2$, as proposed by Smith (1973a).
The choice of $b$ and $V$ is more complicated, and will be considered in
detail below. (Readers who are less interested in the mathematical details of
estimation may wish to skim the remainder of Section 3 and resume in earnest
at Section 4.)

3.2 Bayes Estimates. Informative Prior on $\beta$.
Let us suppose that a scientist has prior information on $\beta$, which he
expresses through his choice of $b$ and $V$. Such choices could be made
directly, as we shall illustrate in Section 7, or by indirect elicitation, as
in the method of Kadane et al. (1980).

Estimation of the parameters now proceeds by straightforward Bayesian
methodology. First, we compute the marginal distribution of the data given
the hyperparameter $\sigma$. From (3.3),

(3.4)     $(Y \mid \sigma) \sim N(Xb,\ C + \sigma^2 I + XVX')$.

In our Bayesian framework, (3.4) can be regarded as the likelihood of $\sigma$
for a priori given values of $b$ and $V$. The posterior distribution of
$\sigma$ is therefore

(3.5)     $\pi(\sigma \mid Y) \propto \pi(\sigma)\, |C + \sigma^2 I + XVX'|^{-1/2} \exp\{ -\tfrac{1}{2} (Y - Xb)' [C + \sigma^2 I + XVX']^{-1} (Y - Xb) \}$,

where $|A|$ is the determinant of $A$. For future reference, we define the
posterior expectation of $\sigma^2$ as

          $\bar{\sigma}^2 = \int \sigma^2\, \pi(\sigma \mid Y)\, d\sigma$,

which can be interpreted as the approximate risk in estimating a particular
$\theta$ by $X\beta$ under squared error loss.

Now consider the posterior distribution of $\beta$. Denoting the posterior
density by $f(\beta \mid Y)$, we have from (3.3)

(3.6)     $f(\beta \mid Y) = \int_0^\infty f(\beta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,

where $f(\beta \mid Y, \sigma)$ is the multivariate normal density
$N(\tilde{\beta}, \tilde{V})$, and

(3.7)     $\tilde{\beta} = \tilde{V} [ X'(C + \sigma^2 I)^{-1} Y + V^{-1} b ]$,
          $\tilde{V} = [ X'(C + \sigma^2 I)^{-1} X + V^{-1} ]^{-1}$.


The posterior distribution of $\beta$ is a mixture of multivariate normal
distributions with mixing probabilities given by (3.5), the posterior density
of $\sigma$. Equations (3.7) are derived by the familiar rule for computing
posterior distributions for the normal data, normal prior, known variance
case. (See, e.g., Raiffa and Schlaifer (1968).) With
$W = (C + \sigma^2 I)^{-1}$ known, the least squares estimator
$\hat{\beta} = (X'WX)^{-1} X'WY$ has mean $\beta$ and precision matrix
$(X'WX)$ (where, for the sake of this intuitive argument, $X$ here is
necessarily of full rank). Moreover, the prior distribution of $\beta$ has
mean $b$ and precision $V^{-1}$. The rule for computing the posterior
distribution of $\beta$ is to weight the least squares estimate and the prior
mean by their precisions, with the precision of the result equal to the sum of
these precisions. In the analysis below, we shall be interested in the fitted
values $X\beta$ from the underlying constant relative potency hypothesis. The
posterior density of $X\beta$ is the corresponding mixture of multivariate
normal densities $N(X\tilde{\beta}, X\tilde{V}X')$, where the mixing
probabilities are still $\pi(\sigma \mid Y)$ and $\tilde{\beta}$ and
$\tilde{V}$ are defined in (3.7).

Consider, finally, the estimation of $\theta$. If $g(\theta \mid Y)$ is the
posterior density of $\theta$, we have

(3.8)     $g(\theta \mid Y) = \int_0^\infty g(\theta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,

where $g(\theta \mid Y, \sigma)$ is the multivariate normal density
$N(\tilde{\theta}, \tilde{C})$, and

(3.9)     $\tilde{\theta} = \tilde{C} [ C^{-1} Y + (XVX' + \sigma^2 I)^{-1} X b ]$,
          $\tilde{C} = [ C^{-1} + (XVX' + \sigma^2 I)^{-1} ]^{-1}$.

The posterior distribution of $\theta$ is similarly a mixture of normal
distributions. The means and covariances of these normal distributions, given
by (3.9), are derived in a manner analogous to (3.7), where, by (3.3),
$Y \mid \theta \sim N(\theta, C)$ and
$\theta \mid \sigma \sim N(Xb, XVX' + \sigma^2 I)$. Each $\tilde{\theta}$ is a
weighted average of the original data $Y$ and the corresponding prior
prediction $Xb$ from the underlying constant relative potency model, where the
weights are the corresponding precisions.
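
For a fixed value of $\sigma$, the conditional moments in (3.7) and (3.9) are
plain matrix computations. The sketch below is illustrative only: the data are
the Table 1 log slopes as transcribed, while the prior mean `b`, the prior
covariance `V`, and the value of `sigma` are hypothetical choices made purely
to exercise the formulas. As a consistency check, the mean of $\theta$ from
(3.9) is compared with the value obtained by averaging the conditional mean of
$\theta$ given $\beta$ over the posterior of $\beta$ in (3.7).

```python
import numpy as np

def conditional_posterior(Y, C, X, b, V, sigma):
    """Moments of beta and theta given sigma, following (3.7) and (3.9)."""
    I = np.eye(len(Y))
    W = np.linalg.inv(C + sigma**2 * I)
    V_inv = np.linalg.inv(V)
    V_post = np.linalg.inv(X.T @ W @ X + V_inv)             # (3.7) covariance
    beta_post = V_post @ (X.T @ W @ Y + V_inv @ b)          # (3.7) mean
    C_inv = np.linalg.inv(C)
    prior_prec = np.linalg.inv(X @ V @ X.T + sigma**2 * I)
    C_post = np.linalg.inv(C_inv + prior_prec)              # (3.9) covariance
    theta_post = C_post @ (C_inv @ Y + prior_prec @ X @ b)  # (3.9) mean
    return beta_post, V_post, theta_post, C_post

Y = np.array([0.49, 1.48, -0.63, 0.74])
C = np.diag(np.array([1.41, 0.34, 0.04, 0.04]) ** 2)
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
b = np.zeros(3)      # hypothetical prior mean for beta
V = 4.0 * np.eye(3)  # hypothetical prior covariance for beta
sigma = 0.5          # hypothetical value of sigma

beta_post, V_post, theta_post, C_post = conditional_posterior(Y, C, X, b, V, sigma)

# Cross-check: average E[theta | Y, beta, sigma] over the posterior of beta.
D = np.linalg.inv(np.linalg.inv(C) + np.eye(4) / sigma**2)
theta_check = D @ (np.linalg.inv(C) @ Y + X @ beta_post / sigma**2)

print("posterior mean of theta:", np.round(theta_post, 3))
```

The full posterior in (3.8) would then mix such conditional results over
$\pi(\sigma \mid Y)$.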

3.3 Bayes Estimates. Diffuse Prior on $\beta$.
Calculation of the posterior distributions (3.5), (3.6), and (3.8)
requires us to specify the mean $b$ and the covariance matrix $V$ of the prior
distribution of $\beta$. In many situations, however, information about the
hyperparameters $\beta$ will be extremely vague. That is, the prior covariance
matrix $V$ will be large. To investigate such cases in detail, we first need
the following lemma.

Lemma: Let $U$ be an $n \times m$ matrix of rank $m < n$, $I$ be the
$n \times n$ identity matrix, and $t$ be a scalar. Then as $t \to \infty$,

(3.10)    $(I + tUU')^{-1} = I - U(U'U)^{-1}U' + t^{-1}(UU')^{+} + O(t^{-2})$,

(3.11)    $|I + tUU'| = t^{m}\, |U'U|\, [ 1 + t^{-1}\, \mathrm{tr}(UU')^{+} + O(t^{-2}) ]$,

where $A^{+}$ is the Moore-Penrose pseudo-inverse of $A$.

Proof: The $n \times n$ matrix $UU'$, which has rank $m$, can be represented as

          $UU' = \sum_{j=1}^{m} \lambda_j u_j u_j'$,

where $\{\lambda_j\}$ are the strictly positive characteristic roots of $UU'$
and $\{u_j\}$ are the corresponding characteristic vectors. The $n \times n$
identity matrix can be represented as

          $I = \sum_{j=1}^{m} u_j u_j' + \sum_{i=1}^{n-m} v_i v_i'$,

where the unit vectors $\{v_i\}$ are all orthogonal to the characteristic
vectors $\{u_j\}$. Combining these two expressions, we have

          $I + tUU' = \sum_{j=1}^{m} (1 + t\lambda_j) u_j u_j' + \sum_{i=1}^{n-m} v_i v_i'$.

Now

          $(I + tUU')^{-1} = \sum_{j=1}^{m} (1 + t\lambda_j)^{-1} u_j u_j' + \sum_{i=1}^{n-m} v_i v_i'$.

As $t \to \infty$,

          $\sum_{j=1}^{m} (1 + t\lambda_j)^{-1} u_j u_j' = t^{-1} \sum_{j=1}^{m} \lambda_j^{-1} u_j u_j' + O(t^{-2})$.

Equation (3.10) now follows from our recognition that $\sum_i v_i v_i'$ is the
orthogonal projection operator which maps $R^n$ onto the subspace of $R^n$
orthogonal to the columns of $U$ (namely $I - U(U'U)^{-1}U'$), while
$\sum_j \lambda_j^{-1} u_j u_j'$ defines the pseudo-inverse. Similarly,

          $|I + tUU'| = \prod_{j=1}^{m} (1 + t\lambda_j) = t^{m} \prod_{j=1}^{m} \lambda_j\, ( 1 + t^{-1} \sum_{j=1}^{m} \lambda_j^{-1} + O(t^{-2}) )$.

Equation (3.11) follows from our recognition that
$|U'U| = \prod_j \lambda_j$ and $\mathrm{tr}(UU')^{+} = \sum_j \lambda_j^{-1}$.
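
The two expansions in the lemma can be checked numerically. The sketch below
is an illustration under stated assumptions: the matrix `U` is randomly
generated (seed and dimensions are arbitrary), and the pseudo-inverse of
$UU'$ is formed as $U(U'U)^{-2}U'$, which is valid when $U$ has full column
rank. The approximation errors should shrink roughly as $t^{-2}$.

```python
import numpy as np

rng = np.random.default_rng(0)     # arbitrary seed for a reproducible example
n, m = 5, 2
U = rng.standard_normal((n, m))    # hypothetical full-column-rank matrix
I = np.eye(n)
G = U @ U.T

A = np.linalg.inv(U.T @ U)
P_perp = I - U @ A @ U.T           # projection orthogonal to the columns of U
G_pinv = U @ A @ A @ U.T           # pseudo-inverse of UU' for full-rank U

def inverse_error(t):
    """Largest entrywise gap between (I + tUU')^{-1} and the expansion (3.10)."""
    exact = np.linalg.inv(I + t * G)
    approx = P_perp + G_pinv / t
    return float(np.abs(exact - approx).max())

def det_ratio(t):
    """Ratio of |I + tUU'| to the leading terms of the expansion (3.11)."""
    exact = np.linalg.det(I + t * G)
    approx = t**m * np.linalg.det(U.T @ U) * (1.0 + np.trace(G_pinv) / t)
    return float(exact / approx)

for t in (1e2, 1e3, 1e4):
    print(t, inverse_error(t), det_ratio(t))
```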


We  now  have  the  following  result. 

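Proposition: If the nonsingular covariance matrix $V$ is replaced by $tV$,
where $t$ is a scalar, then as $t \to \infty$:

(a) The posterior density of $\sigma$ approaches

(3.12)    $\pi(\sigma \mid Y) \propto \pi(\sigma)\, |W|^{1/2}\, |X'WX|^{-1/2} \exp\{ -\tfrac{1}{2} Y'SY \}$,

where $W = (C + \sigma^2 I)^{-1}$ and $S = W - WX(X'WX)^{-1}X'W$.

(b) The posterior density of $\beta$ approaches
$f(\beta \mid Y) = \int f(\beta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,
where $f(\beta \mid Y, \sigma)$ is multivariate normal
$N(\tilde{\beta}, \tilde{V})$ and

(3.13)    $\tilde{\beta} = \tilde{V} X'(C + \sigma^2 I)^{-1} Y$,
          $\tilde{V} = [ X'(C + \sigma^2 I)^{-1} X ]^{-1}$.

(c) The posterior density of $\theta$ approaches
$g(\theta \mid Y) = \int g(\theta \mid Y, \sigma)\, \pi(\sigma \mid Y)\, d\sigma$,
where $g(\theta \mid Y, \sigma)$ is multivariate normal
$N(\tilde{\theta}, \tilde{C})$ and

(3.14)    $\tilde{\theta} = \tilde{C} C^{-1} Y = (I + CR)^{-1} Y$,
          $\tilde{C} = (C^{-1} + R)^{-1}$,

where $R = \sigma^{-2} [ I - X(X'X)^{-1}X' ]$.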

Proof: (a) Note that the quadratic form $Y'SY$ in (3.12) is the sum of
squared residuals of the weighted least squares regression of $Y$ on the
columns of $X$, where the weights are the diagonal elements of $W$. That the
quadratic form $(Y - Xb)' [C + \sigma^2 I + XVX']^{-1} (Y - Xb)$ in (3.5)
reduces to $Y'SY$ in (3.12) is a result of expansion formula (3.10), where we
set $U = W^{1/2} X V^{1/2}$ and
$S = W^{1/2} ( I - U(U'U)^{-1}U' ) W^{1/2}$. That the determinant
$|C + \sigma^2 I + XVX'|^{-1/2}$ in (3.5) becomes proportional to
$|W|^{1/2} |X'WX|^{-1/2}$ in (3.12) is a result of expansion formula (3.11)
under the same definition of $U$. (As noted in Section 3.1, we assume here a
parametrization in which $X$ is of full rank.)

(b) Expressions (3.13) follow from our setting $V^{-1} = 0$ in expressions
(3.7).

(c) That expression (3.9) reduces to (3.14) is a result of the expansion
formula (3.10), where we set $U = XV^{1/2}$ in order to evaluate the terms
$(XVX' + \sigma^2 I)^{-1}$ in (3.9). We note also that equation (3.14) is a
special case of equation (A1) of Smith (1973a).
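
The diffuse-prior results (3.12) and (3.14) can be evaluated on a grid of
$\sigma$ values. The following sketch is a hedged illustration, not the
computation reported in the paper: it uses the Table 1 log slopes as
transcribed, assumes a flat prior $\pi(\sigma)$ on a hypothetical grid, and
approximates the mixture over $\sigma$ by a discrete sum.

```python
import numpy as np

Y = np.array([0.49, 1.48, -0.63, 0.74])   # Table 1 log slopes
c = np.array([1.41, 0.34, 0.04, 0.04])    # Table 1 coefficients of variation
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
C = np.diag(c**2)
n = len(Y)
P = X @ np.linalg.solve(X.T @ X, X.T)     # projection onto the columns of X

def log_posterior_sigma(sigma):
    """Log of (3.12) up to a constant, under an assumed flat prior on sigma."""
    W = np.linalg.inv(C + sigma**2 * np.eye(n))
    XtWX = X.T @ W @ X
    S = W - W @ X @ np.linalg.solve(XtWX, X.T @ W)
    return (0.5 * np.linalg.slogdet(W)[1]
            - 0.5 * np.linalg.slogdet(XtWX)[1]
            - 0.5 * float(Y @ S @ Y))

def theta_given_sigma(sigma):
    """Conditional mean and covariance of theta from (3.14)."""
    R = (np.eye(n) - P) / sigma**2
    mean = np.linalg.solve(np.eye(n) + C @ R, Y)
    cov = np.linalg.inv(np.linalg.inv(C) + R)
    return mean, cov

sigmas = np.linspace(0.01, 3.0, 300)      # hypothetical grid for sigma
logp = np.array([log_posterior_sigma(s) for s in sigmas])
weights = np.exp(logp - logp.max())
weights /= weights.sum()                  # discrete stand-in for pi(sigma | Y)

theta_mean = sum(w * theta_given_sigma(s)[0] for w, s in zip(weights, sigmas))
print("mixture mean of theta:", np.round(theta_mean, 3))
```

Two limiting cases are useful checks on (3.14): for very large $\sigma$ the
conditional mean returns the raw data $Y$, and for very small $\sigma$ it
collapses onto the weighted least squares fit of the additive model.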


3.4 Empirical Bayes Estimates.

In the analysis below, we shall also consider empirical Bayes approaches
to estimating $\theta$. In these alternative methods, we retain the prior
distribution $\theta \mid \beta, \sigma \sim N(X\beta, \sigma^2 I)$ as
specified in (3.3c), but use the data $Y$ itself to construct the prior
distributions on $\sigma$ and $\beta$.

Several options are available. (As Dempster (1980) notes, "there is no
such thing as the empirical Bayes estimator.") First, we could estimate both
$\sigma$ and $\beta$ from the data $Y$, by maximum likelihood or other
methods, and then assume that the entire prior density for $\sigma$ is
concentrated at the estimate $\hat{\sigma}$ and the entire prior density of
$\beta$ is concentrated at the estimate $\hat{\beta}$. For a given $\sigma$,
the maximum likelihood estimate of $\beta$ is the least squares estimate

          $\hat{\beta}_{MLE}(\sigma) = ( X'(C + \sigma^2 I)^{-1} X )^{-1} X'(C + \sigma^2 I)^{-1} Y$.

As a function of $\sigma$, the concentrated likelihood, evaluated at
$\beta = \hat{\beta}_{MLE}(\sigma)$, is proportional to

          $L(\sigma) = |W|^{1/2} \exp\{ -\tfrac{1}{2} Y' ( W - WX(X'WX)^{-1}X'W ) Y \}$,

where, again, $W = (C + \sigma^2 I)^{-1}$. If we denote $\hat{\sigma}_{MLE}$
as the value of $\sigma$ maximizing this likelihood function, then the
resulting empirical Bayes posterior distribution of $\theta$ is
$N(\hat{\theta}_{MLE}, \hat{C}_{MLE})$, where, from (3.9) with $V = 0$ and
$b = \hat{\beta}_{MLE}$,

(3.15)    $\hat{\theta}_{MLE} = \hat{C}_{MLE} [ C^{-1} Y + \hat{\sigma}_{MLE}^{-2} X \hat{\beta}_{MLE} ]$,
          $\hat{C}_{MLE} = [ C^{-1} + \hat{\sigma}_{MLE}^{-2} I ]^{-1}$.

This estimate treats $\beta$ as if it were fixed and known a priori, even
though the estimate $\hat{\beta}_{MLE}$ is used.

Alternatively, we could assume a diffuse prior on $\beta$ and estimate only
$\sigma$ from the data $Y$. In this case, the appropriate likelihood function
for $\sigma$ is equation (3.12) with the prior density $\pi(\sigma)$ omitted,
that is,

          $L^{*}(\sigma) = |W|^{1/2}\, |X'WX|^{-1/2} \exp\{ -\tfrac{1}{2} Y'SY \}$.

Let $\hat{\sigma}_{EB}$ be the value of $\sigma$ that maximizes
$L^{*}(\sigma)$. The corresponding empirical Bayes posterior distribution
treats $\sigma = \hat{\sigma}_{EB}$ as if it were known with certainty, and
so, by equation (3.14), is $N(\hat{\theta}_{EB}, \hat{C}_{EB})$, where

(3.16)    $\hat{\theta}_{EB} = [ I + \hat{\sigma}_{EB}^{-2}\, C ( I - X(X'X)^{-1}X' ) ]^{-1} Y$.


It is interesting to note that in the case where $\sigma$ is known, Smith
(1973b) shows that if $\hat{\sigma}_{MLE}$ and $\hat{\sigma}_{EB}$ are both
replaced by $\sigma$, then $\hat{\theta}_{EB} = \hat{\theta}_{MLE}$. It is
clear from comparison of (3.15) with (3.16) that if
$\hat{\sigma}_{MLE} = \hat{\sigma}_{EB}$, then
$\hat{C}_{MLE} \le \hat{C}_{EB}$ (in the sense that
$\hat{C}_{EB} - \hat{C}_{MLE}$ is non-negative definite). Moreover,
$\hat{\sigma}_{MLE} \le \hat{\sigma}_{EB}$, since they maximize $L(\sigma)$
and $L^{*}(\sigma)$, respectively, and since the ratio
$L(\sigma)/L^{*}(\sigma) = |X'(C + \sigma^2 I)^{-1}X|^{1/2}$ can be shown to
be a decreasing function of $\sigma$. Since $L^{*}$ is the product of $L$ and
an increasing function of $\sigma$, its maximum will occur later than that of
$L$. The assumption that $\beta = \hat{\beta}_{MLE}$ with certainty leads to a
smaller posterior variance for $\theta$ than the diffuse prior for $\beta$.
Therefore, to the extent that $\beta$ is a priori uncertain, the variance
$\hat{C}_{MLE}$ is inappropriately small. In the results below, we shall
therefore report the Empirical Bayes estimate (3.16).
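
The ordering of the two maximizers can be explored numerically. The sketch
below is illustrative only: it evaluates the logarithms of $L(\sigma)$ and
$L^{*}(\sigma)$ on a hypothetical grid, using the Table 1 log slopes as
transcribed, and locates each maximum by direct search rather than by any
method used in the original analysis.

```python
import numpy as np

Y = np.array([0.49, 1.48, -0.63, 0.74])   # Table 1 log slopes
c = np.array([1.41, 0.34, 0.04, 0.04])    # Table 1 coefficients of variation
X = np.array([[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]], dtype=float)
C = np.diag(c**2)
n = len(Y)

def pieces(sigma):
    """Return log|W|, log|X'WX| and Y'SY for W = (C + sigma^2 I)^{-1}."""
    W = np.linalg.inv(C + sigma**2 * np.eye(n))
    XtWX = X.T @ W @ X
    S = W - W @ X @ np.linalg.solve(XtWX, X.T @ W)
    return np.linalg.slogdet(W)[1], np.linalg.slogdet(XtWX)[1], float(Y @ S @ Y)

def log_L(sigma):
    """Log concentrated likelihood (beta replaced by its least squares value)."""
    log_det_W, _, quad = pieces(sigma)
    return 0.5 * log_det_W - 0.5 * quad

def log_L_star(sigma):
    """Log likelihood for sigma under a diffuse prior on beta."""
    log_det_W, log_det_XtWX, quad = pieces(sigma)
    return 0.5 * log_det_W - 0.5 * log_det_XtWX - 0.5 * quad

sigmas = np.linspace(0.0, 3.0, 3001)      # hypothetical search grid
sigma_mle = sigmas[int(np.argmax([log_L(s) for s in sigmas]))]
sigma_eb = sigmas[int(np.argmax([log_L_star(s) for s in sigmas]))]

print("grid maximizer of L :", sigma_mle)
print("grid maximizer of L*:", sigma_eb)
```

The log ratio `log_L(s) - log_L_star(s)` equals half the log determinant of
$X'WX$, which decreases with $\sigma$; this is the property the text uses to
argue that the maximizer of $L^{*}$ cannot lie below that of $L$.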


Finally, if we wish to avoid the computational burden in determining
$\hat\sigma_{MLE}$ or $\hat\sigma_{EB}$, we could begin with
$\hat\beta(0) = (X'C^{-1}X)^{-1}X'C^{-1}Y$. The residual sum of squares for this
estimate, $RSS = \sum_{i=1}^{n} (y_i - x_i\hat\beta(0))^2 / c_i^2$, has
expectation

$E[RSS] = n - m + \sigma^2 [\sum_{i=1}^{n} c_i^{-2} - \mathrm{tr}\, C^{-1}X(X'C^{-1}X)^{-1}X'C^{-1}]$,

which suggests the estimate

(3.17)    $\hat\sigma_{RSS}^2 = [RSS - (n-m)] / [\sum_{i=1}^{n} c_i^{-2} - \mathrm{tr}\, C^{-1}X(X'C^{-1}X)^{-1}X'C^{-1}]$

where we take $\hat\sigma_{RSS} = 0$ if $RSS < n-m$. (The exact value of E[RSS]
was derived for us by H. Chernoff, whose proof is omitted.) In the results
below, we shall also report the empirical Bayes estimate of $\theta$ when
$\hat\sigma_{RSS}$ is substituted for $\hat\sigma_{EB}$ in (3.16).
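
The moment estimate (3.17) is simple enough to sketch in a few lines of Python.
This is only an illustration under stated assumptions: the function name
`sigma_rss` is hypothetical, C is taken to be diagonal with entries $c_i^2$, and
the example data are the four 2x2 log slopes with an additive design matrix
whose row ordering is our own choice.

```python
import numpy as np

def sigma_rss(y, c, X):
    """Return (sigma_hat_RSS, RSS) following the moment estimate (3.17)."""
    y = np.asarray(y, dtype=float)
    c = np.asarray(c, dtype=float)
    X = np.asarray(X, dtype=float)
    n, m = X.shape
    Cinv = np.diag(1.0 / c**2)               # C^{-1}, with C = diag(c_i^2)
    A = X.T @ Cinv @ X
    beta0 = np.linalg.solve(A, X.T @ Cinv @ y)   # weighted LS fit at sigma = 0
    rss = float(np.sum((y - X @ beta0) ** 2 / c**2))
    H = Cinv @ X @ np.linalg.solve(A, X.T @ Cinv)
    denom = np.sum(1.0 / c**2) - np.trace(H)
    sigma2 = max(0.0, (rss - (n - m)) / denom)   # truncate at zero
    return float(np.sqrt(sigma2)), rss

# 2x2 data, ordered (hypothetically) man/tar, man/coke, mouse/tar, mouse/coke.
y = [0.495, 1.482, -0.63, 0.74]
c = [1.415, 0.341, 0.04, 0.04]
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]

sigma_hat, rss = sigma_rss(y, c, X)
print(sigma_hat, rss)
```

For these four observations RSS falls below $n - m = 1$, so the truncation rule
applies and the estimate is zero.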

4.  THE 2X2 CASE
In the next three sections of this paper, we shall assume a diffuse prior
on the vector of hyperparameters $\beta$. To be sure, a scientist may have prior
information on the composition of each emission, the carcinogenic activities
of its constituents, their possible synergistic interactions, their
bioavailability, etc. Similarly, a scientist may have prior information on
the sensitivity of the mouse skin tumor initiation model in comparison to the
human respiratory tract. Our impression, however, is that this type of
information is not yet sufficiently refined to offer much help in specifying a
precise prior on $\beta$. We recognize that the use of improper priors may
involve certain marginalization paradoxes. Discussion of these potential
difficulties is deferred to Section 7. At this point, we note that the
analysis of the next three sections was repeated under the assumption of a
proper prior for $\beta$ with $V = 10^4 I$. The results of all reported
quantities were unchanged up to the number of decimals presented.

Devising a prior distribution for the critical hyperparameter $\sigma$ is
another matter. Perfect extrapolation from mouse to man or from one
environmental agent to another is clearly quite unlikely. To claim that the
various experiments in Table 1 are totally irrelevant to each other is
likewise too strong. The answer lies somewhere in between.

It seems reasonable to suspect that within a range of one normal standard
deviation, i.e., with probability 0.68, the underlying constant relative
potency model could be accurate within a multiplicative factor of exp(5) =
150, or even exp(0.5) = 1.6. To suspect that, with probability 0.68, the
model could be accurate within a factor of exp(0.05) = 1.05 is more
controversial. One of us, in fact, found the possibility of significant
interspecies differences in particulate distribution, extraction, clearance,
metabolism, etc. so compelling that an a priori error factor of 1.05 seemed
unimaginable. On the other hand, we both felt that it would be inappropriate
to attach a uniform distribution to $\sigma$, since there is uncertainty even in
the order of magnitude of error. To articulate our differences, and
agreements, we formulated two prior distributions on $\sigma$.

Prior A: $\log\sigma$ uniformly distributed on the interval
         $0.05 \le \sigma \le 5$; and
Prior B: $\log\sigma$ uniformly distributed on the interval
         $0.5 \le \sigma \le 5$.

These distributions somewhat artificially attach zero probability mass outside
the specified intervals [0.05,5] and [0.5,5]. As we shall see shortly,
however, this restriction does not significantly affect our main conclusions.
Giri (1970) has employed a uniform distribution on $\log\sigma$ for
$0 < \sigma < \infty$ in his Bayesian model for two-way ANOVA. However, we
prefer the use of a proper prior distribution because it compels us to face the
task of assessing our beliefs. Priors A and B retain the feature that, within
the relevant intervals, the posterior density of $\log\sigma$ will be
proportional to the likelihood function.

In order to simplify the computations, we shall evaluate these prior
distributions, and therefore the posterior distribution $\pi(\sigma|Y)$, only at
discrete points in the relevant intervals, equally spaced on the log scale.
This means that the posterior distributions of $X\beta$ and $\theta$ will be
finite mixtures of normal distributions.
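
The discretization just described can be sketched as follows. This is a
minimal illustration, not the authors' program: the forms
$W = (C + \sigma^2 I)^{-1}$ and $S = W - WX(X'WX)^{-1}X'W$ are assumed (they
are the usual ones when $\beta$ has a diffuse prior), the grid size of 41
points is arbitrary, and the names `log_lstar` and `grid_posterior` are
hypothetical. Because the grid is equally spaced in $\log\sigma$ and the prior
is uniform in $\log\sigma$, each grid point receives equal prior weight, so the
posterior weights are simply the normalized likelihood values.

```python
import numpy as np

def log_lstar(sigma, y, c, X):
    """log L*(sigma) up to an additive constant (diffuse prior on beta)."""
    W = np.diag(1.0 / (c**2 + sigma**2))            # assumed W
    XtWX = X.T @ W @ X
    S = W - W @ X @ np.linalg.solve(XtWX, X.T @ W)  # assumed S
    logdet_W = np.linalg.slogdet(W)[1]
    logdet_XtWX = np.linalg.slogdet(XtWX)[1]
    return 0.5 * logdet_W - 0.5 * logdet_XtWX - 0.5 * float(y @ S @ y)

def grid_posterior(y, c, X, lo, hi, num=41):
    """Posterior weights for sigma on a log-spaced grid over [lo, hi]."""
    y, c, X = (np.asarray(a, dtype=float) for a in (y, c, X))
    grid = np.exp(np.linspace(np.log(lo), np.log(hi), num))
    logp = np.array([log_lstar(s, y, c, X) for s in grid])
    w = np.exp(logp - logp.max())                   # stabilize before normalizing
    return grid, w / w.sum()

# 2x2 data, ordered (hypothetically) man/tar, man/coke, mouse/tar, mouse/coke.
y = [0.495, 1.482, -0.63, 0.74]
c = [1.415, 0.341, 0.04, 0.04]
X = [[1, 0, 0], [1, 0, 1], [1, 1, 0], [1, 1, 1]]

for name, lo in [("Prior A", 0.05), ("Prior B", 0.5)]:
    grid, w = grid_posterior(y, c, X, lo, 5.0)
    sigma_star = np.sqrt(np.sum(w * grid**2))       # sqrt of E[sigma^2 | Y]
    print(name, "most probable grid point:", grid[np.argmax(w)],
          "sigma*:", sigma_star)
```

Each grid point then indexes one normal component of the mixture for $\theta$,
with the weights `w` as mixing proportions.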
Figure 2 displays the posterior densities $\pi(\sigma|Y)$ calculated from (3.12)
for our two distinct priors. For the 2x2 data matrix in Table 1, the form
of the posterior distribution of $\sigma$ is clearly sensitive to the prior
distribution that is assumed. In the interval $0.05 \le \sigma \le 0.5$, in
particular, the likelihood function is relatively flat. Yet the maximum
likelihood estimate of $\sigma$ is zero.

This finding is reflected in Table 2, which shows selected statistics of
the posterior distributions of $X\beta$ and $\theta$ for the epidemiological
studies, based upon the two prior distributions of $\sigma$. Also shown are the
results for the empirical Bayes estimate corresponding to (3.16). (Both
$\hat\sigma_{EB}$ and $\hat\sigma_{RSS}$ were zero in this case.)

Although the posterior distribution of $\theta$ is a mixture of multivariate
normals (recall (3.8)), the resulting marginal distributions did not in fact
deviate substantially from normality. If $\theta_i^* = E[\theta_i|Y]$ and if
$c_i^*$ is the standard deviation of $\theta_i|Y$, then the tail probabilities

$\Pr\{\theta_i > \theta_i^* + 2.326 c_i^* \mid Y\}$,
$\Pr\{\theta_i < \theta_i^* - 2.326 c_i^* \mid Y\}$,

do not deviate substantially from the value of 0.01 predicted for the normal
density. Hence, the mean and standard deviation adequately characterize the
marginals of the posterior distribution of $\theta$.
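
The normality check above is easy to reproduce for any finite mixture of
normals. The sketch below is illustrative only: the function name
`mixture_tails` and the example weights, means and standard deviations are
invented for the demonstration and are not taken from the paper's posterior.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * erfc(-z / sqrt(2.0))

def mixture_tails(weights, means, sds, z=2.326):
    """Mean, s.d. and the two tail probabilities beyond mean +/- z*s.d.
    for a univariate finite mixture of normal components."""
    total = sum(weights)
    w = [wi / total for wi in weights]
    mean = sum(wi * mi for wi, mi in zip(w, means))
    second = sum(wi * (si**2 + mi**2) for wi, mi, si in zip(w, means, sds))
    sd = sqrt(second - mean**2)
    lo, hi = mean - z * sd, mean + z * sd
    lower = sum(wi * normal_cdf((lo - mi) / si)
                for wi, mi, si in zip(w, means, sds))
    upper = sum(wi * normal_cdf(-(hi - mi) / si)
                for wi, mi, si in zip(w, means, sds))
    return mean, sd, lower, upper

# A single component reproduces the nominal normal tail area.
print(mixture_tails([1.0], [0.0], [1.0]))
# A hypothetical two-component scale mixture has visibly heavier tails.
print(mixture_tails([0.5, 0.5], [0.0, 0.0], [1.0, 3.0]))
```

Tail areas close to 0.01 in both directions indicate that the mixture is well
summarized by its mean and standard deviation.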

Because the original coke oven data were relatively precise, the means
and standard deviations of the posterior distributions of the coke oven log
slope do not differ much from the original values of y and c. For the
roofing tar log slope, however, the precision of the posterior distribution
depends critically on the estimate used. Since prior A admits the possibility
of lower values of $\sigma$, the corresponding posterior distribution has a
smaller standard deviation. For the empirical Bayes estimate, which in this
case assumes that the underlying constant relative potency model holds exactly,
the only sources of variance in the posterior distribution of $\theta$ are the
original sampling errors.

TABLE 2.
Bayes and Empirical Bayes Estimates of Log Slopes
For Lung Cancer Risk in Man
2x2 Data Matrix

Environmental                             Stand.   Poster.         Lower   Upper
Emission                          Mean    Dev.     Mean $X\beta$   Tail    Tail

Roofing Tar
  Original Data                   0.495   1.415
  $\theta|Y$ (Prior B)            0.365   1.152    0.304           0.011   0.013
  $\theta|Y$ (Prior A)            0.229   0.788    0.205           0.015   0.025
  $\theta|Y$ (Empirical Bayes)    0.135   0.337    0.135

Coke Oven
  Original Data                   1.482   0.341
  $\theta|Y$ (Prior B)            1.489   0.338    1.550           0.010   0.010
  $\theta|Y$ (Prior A)            1.497   0.334    1.522           0.010   0.010
  $\theta|Y$ (Empirical Bayes)    1.502   0.331    1.502

$\theta|Y$ (Empirical Bayes) assumes $\pi(\sigma)$ concentrated at
$\hat\sigma_{EB} = 0$, and diffuse prior on $\beta$.
Lower Tail = $\Pr\{\theta_i \le \theta_i^* - 2.326 c_i^* \mid Y\}$,
Upper Tail = $\Pr\{\theta_i \ge \theta_i^* + 2.326 c_i^* \mid Y\}$,
where $\theta_i^*$ and $c_i^*$ are the posterior mean and standard deviation
of $\theta_i$.

The posterior mean values of $\theta$ are very close to the corresponding
posterior mean values of $X\beta$. That is, the posterior expectations of the
model residuals $\delta$ are small. The posterior variances of these residuals,
however, are not so small. Although the variance of each posterior residual
$\delta_i|Y$ depends in part on the precision of the original data, that
component of the variance due purely to the underlying model is

(4.1)    $\sigma^{*2} = E[\sigma^2|Y]$,

which for prior A in this case is 1.088. In effect, if we were to use these
data to predict $\delta$ for another experiment yet to be performed, the
standard deviation of $\delta$, under Prior A, would be 1.04. Under Prior B, the
standard deviation of $\delta$ would be 1.76, despite the fact that the
empirical Bayes estimate of $\sigma$ is $\hat\sigma_{EB} = 0$. (It is
straightforward to show that $\delta|Y$ is likewise a mixture of normals, each
of which has covariance matrix of the form $\sigma^2 I + D$, where D vanishes as
$\sigma$ approaches 0.)

Little  credence,  we  conclude,  can  be  attached  to  the  apparently  close  fit 
of  the  data  in  Table  1  to  the  underlying  constant  relative  potency  model.  Any 
scientist  who  objects  that  the  data  are  just  "too  good  to  be  true"  makes  a 
legitimate  claim  based  on  his  prior  belief  that  such  extrapolative  models  are 
unlikely to be so accurate. The extent to which the totality of data in Table
1 refines the precision of the estimated human lung cancer risk is, in effect,
a matter of prior opinion.

The problem with the 2x2 case, it appears, is that we don't have enough
precise experiments. The sampling errors for the skin tumor initiation data
in mice are so small that the model is fitted, in effect, to the mouse data.
The predicted relative potencies for the human lung cancer risks merely adjust
to the more precise non-human results. If we are to learn any more about the
extent to which these experiments can be combined, then we need additional
precise experiments. We now proceed in this direction.


5.  THE 3X3 CASE

Table 3 is an augmented version of Table 1. In addition to human lung
cancer epidemiological studies and skin tumor initiation experiments in mice,
we have included experiments on the enhancement of viral oncogenic
transformation in Syrian hamster embryo (SHE) cells (Casto et al., 1979). In
addition to studies on roofing tar and coke oven emissions, we have included
experiments on the dichloromethane extracts of particulate emissions from one
light duty diesel engine.

Except for the epidemiological studies, experiments appearing in the same
row were, as above, performed under identical conditions in the same
laboratory. The new slopes and standard errors were, as above, estimated by
maximum likelihood methods, as described in Harris (1981). No epidemiological
study of the human lung cancer risks from exposure to light duty diesel engine
exhaust was available. Although the corresponding cell is left empty, we note
that the set of available experiments is connected, as defined in Section 3.1
above.

The results in Table 3 clearly reveal inconsistencies in the constant
relative potency hypothesis. In the skin tumor initiation experiments, for
example, roofing tar emission extracts were less potent than coke oven
emission extracts. In the viral transformation studies, roofing tar emission
extracts were more potent than coke oven emissions extracts.

TABLE 3.
3x3 Experimental Data Matrix

                                        Roofing     Coke        Diesel
                                        Tar         Oven        Engine
                                        Emissions   Emissions   Emissions

Lung Cancer (Man)          slope         1.64        4.40
                           coef.var.     1.41        0.34
                           log slope     0.49        1.48

Skin Tumor Initiation      slope         0.54        2.10        0.53
(Sencar Mice)              coef.var.     0.04        0.04        0.04
                           log slope    -0.63        0.74       -0.64

Enhancement of Viral       slope         2.07        0.86        0.65
Transformation             coef.var.     0.18        0.10        0.15
(SHE Cells)*               log slope     0.73       -0.15       -0.44

*Transformations/2x10 cells per $\mu$g/ml extract.
Units for other rows as in Table 1.

There are no data for lung cancer risk of diesel engine emissions
in man.

Figure 3 shows the posterior densities $\pi(\sigma|Y)$ corresponding to this
3x3 experimental data matrix. The results for both prior distributions A
and B, described in Section 4 above, are shown. We continue to assume a
diffuse prior on $\beta$. In contrast to the 2x2 case, the posterior
distribution of $\sigma$ is considerably less sensitive to the prior
distribution that is assumed. The likelihood function is now more concentrated
around $\hat\sigma_{EB} = 0.726$. In the range $\sigma < 0.2$, the posterior
density of $\sigma$ is virtually zero.

These findings are reflected in Figure 4. Like Figure 1, this figure
shows the means and standard errors of the original data on a logarithmic
scale. Superimposed on these data are the posterior mean values of $X\beta$,
derived from Prior A. Within each species, consecutive pairs of these
posterior mean data points have been connected by dashed lines. Since the
data points for each emission are equally spaced along the horizontal, and
since the three logarithmic vertical axes are drawn to the same scale, the
underlying constant relative potency hypothesis requires that the dashed lines
connecting each pair of estimates be parallel. As Figure 4 shows, the
underlying model predictions $X\beta^*$ in effect strike a balance between the
contradictory elements in the original data.

Table 4 shows selected statistics of the posterior distributions of $X\beta$
and $\theta$ for the human lung cancer slopes, based on Priors A and B. Also
shown are the empirical Bayes estimates corresponding to (3.16), where
empirical Bayes estimate 1 uses $\hat\sigma_{EB}$ and empirical Bayes estimate
2 substitutes $\hat\sigma_{RSS}$ for $\hat\sigma_{EB}$.

TABLE 4.
Bayes and Empirical Bayes Estimates of Log Slopes
For Lung Cancer Risk in Man
3x3 Data Matrix

Environmental                           Stand.   Poster.         Lower   Upper
Emission                        Mean    Dev.     Mean $X\beta$   Tail    Tail

Roofing Tar
  Original Data                 0.495   1.415
  $\theta|Y$ (Prior B)          0.818   1.058    0.945           0.014   0.010
  $\theta|Y$ (Prior A)          0.832   1.036    0.952           0.015   0.010
  $\theta|Y$ (E. Bayes 1)       0.884   0.960    0.987
  $\theta|Y$ (E. Bayes 2)       0.861   0.995    0.972

Coke Oven
  Original Data                 1.482   0.341
  $\theta|Y$ (Prior B)          1.463   0.337    1.336           0.010   0.010
  $\theta|Y$ (Prior A)          1.462   0.336    1.341           0.010   0.010
  $\theta|Y$ (E. Bayes 1)       1.459   0.336    1.356
  $\theta|Y$ (E. Bayes 2)       1.460   0.336    1.348

Diesel Engine
  $\theta|Y$ (Prior B)          0.434   1.875    0.434           0.017   0.015
  $\theta|Y$ (Prior A)          0.442   1.818    0.442           0.017   0.016
  $\theta|Y$ (E. Bayes 1)       0.466   1.217    0.466
  $\theta|Y$ (E. Bayes 2)       0.454   1.300    0.455

$\theta|Y$ (Empirical Bayes 1) assumes prior $\pi(\sigma)$ concentrated at
$\hat\sigma_{EB} = 0.726$, and diffuse prior on $\beta$.
$\theta|Y$ (Empirical Bayes 2) assumes prior $\pi(\sigma)$ concentrated at
$\hat\sigma_{RSS} = 0.782$, and diffuse prior on $\beta$.

In comparison to the results for the 2x2 case (Table 2), the standard
deviations of the posterior distributions for the roofing tar log slope were
considerably less sensitive to the estimation method used. For both roofing
tar and coke oven emissions, the posterior mean values of $\theta$ now deviate
from the corresponding posterior mean values of $X\beta$.

In the diesel engine case, however, the contrast between the Bayes and
empirical Bayes estimates is more striking. Because there were no original
epidemiological data in this case, the posterior precision of $\theta$ depends
solely on our assumptions about the hyperparameter $\sigma$. Whereas
$\hat\sigma_{EB} = 0.726$ and $\hat\sigma_{RSS} = 0.782$ for these data, the
Bayes estimates are $\sigma^* = 1.150$, based on Prior A, and
$\sigma^* = 1.189$, based on Prior B.

The scientist who voices skepticism at the close fit of the data in
Figure 1 has, it appears, been vindicated. If we take $\sigma$ to be its maximum
likelihood estimate $\hat\sigma_{EB} = 0.726$, then extrapolations between
species or environmental agents, we conclude, will be accurate only to a
multiplicative factor of 2 with 68 percent probability and only to a
multiplicative factor of 4 with 95 percent probability. If we take $\sigma$ to
be the Bayes estimate $\sigma^* = 1.150$ (based on Prior A), then such
extrapolations, we conclude, can be accurate only to a multiplicative factor of
exp(1.15) = 3 with 68 percent probability and only to a multiplicative factor
of exp(2 x 1.15) $\approx$ 10 with 95 percent probability.
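
The multiplicative error factors quoted above follow directly from
exponentiating one and two standard deviations on the log scale. A short
Python check, in which the function name `error_factors` is purely
illustrative:

```python
from math import exp

def error_factors(sigma):
    """Multiplicative factors implied by +/- 1 and +/- 2 s.d. on the log scale."""
    return exp(sigma), exp(2.0 * sigma)

# Values of sigma quoted in the text for the 3x3 data matrix.
for label, s in [("empirical Bayes", 0.726), ("Bayes, Prior A", 1.150)]:
    f68, f95 = error_factors(s)
    print(f"{label}: sigma = {s:.3f}, "
          f"68% factor = {f68:.2f}, 95% factor = {f95:.2f}")
```

The rounded factors of 2 and 4, and of 3 and 10, in the text correspond to
these values.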

We are now in a position to contrast our statistical approach with others
in the literature. Our equation (3.2) partitions the sources of variation
among experiments into several components. We thus follow Cochran's (1980)
suggestion that "the summary of a series of experiments calls mainly for
experience in the analysis of variance." In our decomposition of these
sources of variation, however, we do not assume that the true values of the
log slopes exactly obey an additive model. We assume only that the true
values lie within $\pm\sigma$ of such a model with probability 0.68. Further,
when confronted with the problem of estimating the slope for a particular
experiment, we have no belief that the slope in question is at all unusual in
its deviation from the underlying equal relative potency model. For us, the
set of all such deviations is exchangeable. The distinction between the Bayes
and the empirical Bayes approaches depends on our willingness either to assign
a prior distribution to $\sigma$ (then integrating with respect to $\sigma$) or
to use a point estimate for $\sigma$ as if it were known. We prefer the full
Bayesian procedure, especially when there are relatively few experiments, since
the uncertainty in $\sigma$ is real and should contribute to our uncertainty
about $\theta$. This point is illustrated by the Bayes and empirical Bayes
standard deviations of $\theta$ for diesel engine emissions in Table 4. On the
other hand, when many experiments are combined, we expect that the choice of
prior distribution and the choice between Bayes and empirical Bayes estimates
will be less important. (See, e.g., Tiao and Zellner (1964).)


One procedure suggested by Lindley and Smith (1972) and Smith (1973a),
which they describe as "modal Bayesian," amounts to the use of
$\hat\sigma_{MLE}$ in our formula (3.16) for $\hat\sigma_{EB}$. We would
describe this approach as yet another version of empirical Bayes. Smith (1973a)
shows that if $\sigma$ is known, then the Bayesian confidence intervals will be
shorter than the classical confidence intervals for $\theta$. Our allowing for
uncertainty in $\sigma$ will tend to lengthen the confidence intervals, but they
will still be shorter than the classical intervals for $\theta_i$ based solely
on the sampling errors $c_i$ from each experiment.

The fully Bayesian analysis of Smith (1973a) differs from ours in several
respects. Since he concentrates on the two-way table with no replications,
the components of error that we call $C$ and $\sigma^2 I$ are combined in that
paper as if $C$ were equal to zero. Moreover, we explicitly calculate the
posterior distribution of $\sigma$ in order to: (a) show the range of
uncertainty remaining for $\sigma$; (b) calculate the value $\hat\sigma_{EB}$
for use in our empirical Bayes procedure; and (c) determine
$\sigma^{*2} = E[\sigma^2|Y]$, which can be interpreted as a Bayesian risk of
interspecies extrapolation if loss is proportional to
$\delta^2 = (\theta - X\beta)^2$. The statistic $\sigma^*$ will also play a
critical role in the diagnostic procedure of the next section.

But there are still two serious problems with our analysis of the data of
Table 3. First, we note that the more precise data come from non-human
experiments. One may legitimately protest that we have merely learned how
accurately we can extrapolate from mouse skin to hamster embryo cells. At the
very least, some test of the assumption of exchangeable extrapolation errors
seems appropriate. Ideally, we should include the results of more precise
human experiments in our analysis.

Second, we have so far said nothing about the choice of experiments to be
included in the analysis. Harris (1981) selected these laboratory bioassays
because they were considered to be valuable quantitative measures of
carcinogenicity, and because tests of several related emissions were performed
in the same laboratory. The U.S. Environmental Protection Agency had chosen
these specific emissions as part of its diesel emission research program
(Huisingh et al., 1979). Although we have presented only a few experiments
initially for expository purposes, it is hardly clear what would happen if we
were to include many more experiments. What is more, there is no obvious
means of deciding which experiments are most appropriate to include.
6.   SELECTING AND REJECTING EXPERIMENTS
6.1  The  5x9  Case 

Table 5 further augments the experimental data in Table 3. In addition
to lung cancer epidemiological studies in man, skin tumor initiation
experiments in mice, and viral transformation studies in SHE cells, we have
included mutagenesis experiments in L5178Y mouse lymphoma cells performed
under two types of conditions (Mitchell et al., 1979). In the row denoted
Mutagenesis-MA, no metabolic activator was added. In the row denoted
Mutagenesis+MA, metabolic activator was included in the experimental
preparation. Thus, both direct and indirect mutagenicity were measured.

In addition to the three emissions given in Table 3, we have included
three other diesel engine emission samples; a sample of particulate emissions
from a gasoline-powered automobile engine; the polyaromatic hydrocarbon
benzo(a)pyrene; and cigarette smoke condensate from the Kentucky 1A1
experimental cigarette, which was designed to be typical of cigarettes smoked
during the 1950s. The diesel engine extract appearing in Table 3 has been
relabeled Diesel I, while the remaining diesel emission samples have been
numbered from II to IV. Diesel emissions II and III were, like Diesel I,
obtained from light duty diesel engines. Diesel emission IV was obtained from
a heavy duty diesel engine. The conditions of collection of these samples are
described in Huisingh et al. (1979). With the exception of the results for
cigarette smoke condensate, all dose-response slopes and the standard errors
are taken from Harris (1981).

Although Harris (1981) did not report the corresponding dose-response
slopes for cigarette smoke condensate, experiments on this agent were reported
in the source studies (Casto et al., 1979; Mitchell et al., 1979; Nesnow et
al., 1979). We were therefore able to estimate these slopes by the same
maximum likelihood methods used by Harris (1981). For the human lung cancer
results, we applied Harris's estimation procedure to the U.S. veterans study
data for men aged 35 to 84 observed during 1954 to 1962 (Kahn, 1966, Appendix
Tables A to D). If h(t,d) is the incidence of lung cancer among men of age t
with accumulated dose d, we obtained a maximum likelihood estimate of the
parameter $\beta$ for the relative risk model

$h(t,d) = h(t,0)(1 + \beta d)$.

(The estimation algorithm is described in DuMouchel (1981).) With accumulated
dose measured in cigarettes per day x years, our maximum likelihood estimate
for $\beta$ was 1.085 x 10^[...] units of incremental relative risk per
cigarettes/day x years (standard error, 0.103 x 10^[...]). This estimate was
then converted into units of incremental relative risk per 10^[...]
$\mu$g/m$^3$ cigarette smoke condensate x years under the assumption that the
typical cigarette smoked by a subject delivered 38 ± 2 mg cigarette smoke
condensate, and that the total daily delivery of condensate was diluted in a
total daily ventilation of 11 ± 2 m$^3$. (We used the methods described by
Harris to incorporate the uncertainty about these dosage conversion units into
the slope and coefficient of variation reported in Table 5.)

Except for the last two columns, the additional experiments in Table 5
were performed, as above, on the dichloromethane extracts of the various
emissions. For the benzo(a)pyrene results, this agent was applied in
concentrated form as a positive control in the same experiments. For the last
column, whole smoke condensate was used. The resulting dose-response units,
we note, are still compatible with the constant relative potency model. For
any two pairs $(k,\ell)$ and $(k',\ell')$, the quantity
$\theta_{k\ell} - \theta_{k'\ell} - \theta_{k\ell'} + \theta_{k'\ell'}$
remains dimensionless.

Table 6 reports the Bayes estimates of the log slopes for human lung
cancer risk. The first two columns (denoted 2x2 and 3x3) summarize the
results of the previous two sections. The third column provides the
corresponding results for the 5x9 experimental data matrix in Table 5. The
right-most column shows the original data. The remaining columns will be
described momentarily. Only the results for roofing tar, coke oven, and
diesel engine I emissions are given. The original slope for the human lung
cancer risk from cigarette smoke was so precise that its posterior density did
not change substantially. Hence, it is not reported.

The empirical Bayes estimate of $\sigma$ for the 5x9 data matrix was 1.041.
Because the posterior density $\pi(\sigma|Y)$ was highly concentrated around
$\hat\sigma_{EB}$, the corresponding value of $\sigma^*$ was nearly equal to
$\hat\sigma_{EB}$. In comparison to the 3x3 case, the standard deviation of the
posterior density of the roofing tar log slope has increased slightly. By
contrast, the corresponding standard deviation for diesel engine I has
declined. These results reflect a balance between two sources of uncertainty
about $\theta$. On the one hand, a large posterior value of $\sigma$ implies
uncertainty in the deviation $\delta$. On the other hand, the larger number of
experiments permits us to estimate $X\beta$ more precisely.

Figure 5 depicts the deviations in these data from the underlying
constant relative potency model. For each of the five species, the Figure
shows the posterior mean values of the residuals $\delta = \theta - X\beta$ for
each emission. When there are no data y for a particular species-emission
pair, the posterior mean of $\delta$ is necessarily zero. Such cases are
therefore omitted from the Figure.

The posterior mean residuals for cigarette smoke, Figure 5 shows, are in
four of five cases relatively large in absolute value. Cigarette smoke is
apparently a weaker human lung carcinogen, a weaker mouse skin tumor
initiator, a more potent transforming agent, and a more potent direct mutagen
than would be predicted in this case from the constant relative potency model.
Further, the range of the mean residuals is largest for the mutagenesis
experiments in the absence of metabolic activator. Tests for indirect
mutagenicity are apparently least compatible with the underlying model. By
contrast, the residuals for mutagenesis with activator are more concentrated
around the origin. A similar finding applies to the viral transformation
results when cigarette smoke is eliminated.

TABLE 6.
Bayes Estimates of Log Slopes For Lung Cancer Risk in Man
Based on Alternative Data Matrices

                         2x2     3x3     5x9     4x9     4x8     3x8     3x7     Data

$\hat\sigma_{EB}$        0.0     0.726   1.041   0.872   0.674   0.389   0.316
$\sigma^*$               1.043   1.150   1.080   0.933   0.730   0.480   0.395

Roofing   $\theta^*$     0.229   0.832   0.123   0.306   0.959   1.526   0.497   0.495
Tar       $c^*$          0.788   1.036   1.108   0.957   0.915   0.742   1.414   1.415

Coke      $\theta^*$     1.497   1.462   1.375   1.366   1.455   1.422           1.482
Oven      $c^*$          0.334   0.336   0.335   0.334   0.335   0.334           0.341

Diesel    $\theta^*$             0.442  -0.458  -0.706   0.207   0.330  -0.836
Engine I  $c^*$                  1.818   1.451   1.304   1.160   0.867   1.582

Prior A for $\pi(\sigma)$ and diffuse prior for $\beta$ assumed for all
calculations.

6.2  A  Diagnostic  Procedure. 
We seek a method to determine which subset of experiments is most
relevant for predicting lung cancer risks in man. Two basic characteristics,
we suggest, are critical to such a procedure.

First, the method ought to be sensitive to the underlying tradeoff
between predictive bias and predictive precision. Suppose that we are
interested in a particular $\theta_i$ for which there is little or no data
(i.e., $c_i$ is large). The inclusion of irrelevant experiments in the data
matrix could result in a biased estimate of $\theta_i$, the size of this bias
being in the order of $\pm\sigma$. However, if we eliminated all but the most
relevant experiments, the remaining experiments could contribute little if
anything to the accuracy of our estimate of $\theta_i$, as measured by
$c_i^*$. This difficulty applies especially to the case where we have no
original data $y_i$ on $\theta_i$ (e.g., the human lung cancer risks for diesel
engine emissions in Table 5). If we eliminated every conceivably irrelevant
experiment, then we would end up with exactly what we had at the start: no
information on $\theta_i$ at all.

Second, we should not eliminate experiments individually. One could
exclude those species-emission pairs with large posterior mean values of
$\delta$. Since these interactions are what determines the relatedness of the
various species and emissions, such a procedure would defeat the purpose of our
analysis. It would be more appropriate to assess whether a specific species
or a specific environmental agent is more or less relevant to the others. We
therefore adopt a cross-validation procedure based on the elimination of
entire rows or columns from the matrix of experiments.

Let $Y_{k-}$ be the vector of log slopes formed by exclusion of all
experiments involving species $k$. For each species $k$, we evaluate the
posterior density $\pi(\sigma \mid Y_{k-})$, and denote
$\sigma^*_{k-} = E[\sigma^2 \mid Y_{k-}]^{1/2}$. We shall say that species $k$
is "less relevant" if $\sigma^*_{k-} < \sigma^*$, where
$\sigma^* = E[\sigma^2 \mid Y]^{1/2}$ as in (4.1). Analogous definitions apply
to $Y_{-l}$ and $\sigma^*_{-l}$ for each environmental agent $l$. The species
or agent for which $\sigma^*_{k-}$ or $\sigma^*_{-l}$ is lowest will be termed
the "least relevant". The least relevant species or emission is the one whose
elimination most improves the relevance of the remaining experiments to each
other. In anticipation of Section 7, we note that the terms less relevant and
least relevant are a posteriori concepts.

Given an initial set of experiments $Y$, a prior density $\pi(\sigma)$, and
a particular $\theta_i$ of interest, we consider the following data analytic
procedure.

(i) Calculate $\sigma^*_{k-}$ and $\sigma^*_{-l}$ for each $k$ and $l$, and
determine the least relevant species or emission.

(ii) Calculate the posterior distribution of $\theta_i$ before and after the
least relevant species or emission is removed.

(iii) Eliminate the least relevant species or emission and repeat steps
(i) and (ii) on the reduced set of experiments so long as: (a) there exists a
less relevant experiment; (b) the least relevant species or emission does not
correspond to $\theta_i$; and (c) the elimination of the least relevant
species or emission reduces $c_i^*$, the posterior standard deviation of
$\theta_i$.

(iv) If conditions (a), (b), and (c) are not all satisfied, the procedure
terminates. The remaining experiments are considered most relevant for
predicting $\theta_i$.
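
Steps (i) through (iv) can be sketched as a backward elimination loop. This is only an illustration under stated assumptions: `sigma_star` and `c_star` are hypothetical caller-supplied functions that return $\sigma^*$ and the posterior standard deviation of the target $\theta_i$ for a given sub-array, and the toy functions and weights below are invented solely to exercise the loop. They are not the posterior calculations of Section 4.

```python
def select_relevant(rows, cols, target, sigma_star, c_star):
    """Backward elimination of whole rows (species) or columns (agents).

    sigma_star(rows, cols) and c_star(rows, cols) are hypothetical,
    caller-supplied functions; target = (row, col) locates theta_i.
    """
    rows, cols = list(rows), list(cols)
    removed = []
    while True:
        # Step (i): leave-one-out sigma* for every remaining row and column.
        trials = []
        if len(rows) > 1:
            trials += [("row", r, [x for x in rows if x != r], cols) for r in rows]
        if len(cols) > 1:
            trials += [("col", c, rows, [x for x in cols if x != c]) for c in cols]
        if not trials:
            break
        kind, name, new_rows, new_cols = min(
            trials, key=lambda t: sigma_star(t[2], t[3]))
        # Condition (a): some deletion must lower sigma*.
        if sigma_star(new_rows, new_cols) >= sigma_star(rows, cols):
            break
        # Condition (b): never delete the row or column holding the target.
        if name == (target[0] if kind == "row" else target[1]):
            break
        # Step (ii) and condition (c): c* for the target must decrease.
        if c_star(new_rows, new_cols) >= c_star(rows, cols):
            break
        rows, cols = new_rows, new_cols
        removed.append((kind, name))
    return rows, cols, removed


# Toy stand-ins (invented numbers): row "A" and column "z" inflate sigma*.
weight = {"A": 0.5, "z": 0.3}

def toy_sigma(rows, cols):
    return 0.2 + sum(weight.get(n, 0.0) for n in rows + cols)

def toy_c(rows, cols):
    # Fewer cells means less pooling, so c* grows as the array shrinks.
    return toy_sigma(rows, cols) + 1.0 / (len(rows) * len(cols))

kept_rows, kept_cols, removed = select_relevant(
    ["human", "A", "B"], ["x", "y", "z"], ("human", "x"), toy_sigma, toy_c)
```

In the toy run the loop deletes row "A", then column "z", and then stops under condition (a) because no further deletion lowers the toy $\sigma^*$.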

We applied this procedure to the 5x9 data matrix in Table 5. Prior A on
$\sigma$ was assumed. We focused on predicting the human lung cancer slopes
for roofing tar emissions and diesel I emissions.

Steps (i), (ii), and (iii) were repeated four times. Figure 6 depicts
the distribution of values of $\sigma^*_{k-}$ and $\sigma^*_{-l}$ for each
iteration, where successive iterations are displayed from left to right. To
assist interpretation, a few species and agents are specifically identified.

For the original 5x9 data matrix, $\sigma^* = 1.08$. Mutagenesis without
metabolic activation was least relevant. Removal of this row resulted in a
4x9 matrix with a new $\sigma^* = 0.933$. Repeating this procedure, we found
cigarette smoke to be least relevant. Removal of this column resulted in a
4x8 matrix with a new $\sigma^* = 0.730$. Again repeating this procedure, we
found skin tumor initiation in mice to be least relevant. Removal of this row
resulted in a 3x8 matrix with a new $\sigma^* = 0.480$. In the final
iteration, coke oven emissions were found to be least relevant, with
$\sigma^*_{-l} = 0.395$. Elimination of coke oven emissions from the 3x8
matrix violated condition (c) in step (iii) above. Hence, the procedure was
terminated and the 3x8 array was deemed most relevant.

The results of this procedure are summarized in those columns of Table 6
labelled 5x9, 4x9, 4x8, 3x8, and 3x7. As we successively eliminate
mutagenesis without activator, cigarette smoke, and skin tumor initiation, the
posterior standard deviations of roofing tar and diesel I emissions decline.
Elimination of coke oven emissions, the least relevant in the 3x8 array,
resulted in a marked increase in the posterior standard deviations of the
roofing tar and diesel emissions parameters. In the case of roofing tar, the
estimates $\theta^*$ and $c^*$ were almost identical to the original values of
$y$ and $c$.

The resulting tradeoff between predictive bias ($\sigma^*$) and predictive
efficiency ($c_i^*$) is depicted graphically in Figure 7. By successive
elimination of least relevant experiments, we are able to reduce $\sigma^*$ to
0.48. Any further reduction in $\sigma^*$ is at the cost of a marked loss of
precision.

Unless we are willing to specify a particular loss function, we cannot
unequivocally conclude that the predictions resulting from the 3x8 matrix are
most preferred. For many public health and environmental policy applications,
however, a reduction in the extent of uncertainty about human risks is
desired. To seek to eliminate less relevant experiments, so long as
predictive efficiency is not reduced, appears to be appropriate for such
situations. (Our procedure depends somewhat upon the choice of prior
distribution $\pi(\sigma)$, but when many experiments are involved, this
dependence should be minimal.)

The human lung cancer experiments, we note from Figure 6, are less
relevant (i.e., $\sigma^*_{\text{human}-} < \sigma^*$) so long as cigarette
smoke is included in the data matrix. This conclusion does not apply to the
4x8 or 3x8 arrays, with cigarette smoke removed. Cigarette smoke contains
numerous carcinogenic and mutagenic compounds other than polyaromatic
hydrocarbons, e.g., nitrosamines and various heterocyclics. The apparent
deviations of the cigarette smoke data from the constant relative potency
model may reflect these differences in chemical composition. Peto (1977) has
similarly remarked that the mutagenic potency of cigarette smoke appears to be
greater than would be expected from its systemic carcinogenicity. Whatever
its interpretation, our procedure leads us to eliminate the most precise human
experiment. As a consequence, our estimate of $\sigma$ is constructed
primarily from the non-human data. We see no completely satisfactory response
to this limitation other than to suggest, where possible, the inclusion of
other precise human data.

Nevertheless, we find the results of our diagnostic procedure
intriguing. Assays for indirect mutagenicity and tumor initiation have been
excluded as less relevant. The remaining laboratory bioassays are designed to
gauge an agent's interference with gene replication and cell differentiation.
For the polyaromatic-containing emissions remaining in the 3x8 table, these
biological processes could be critical to human lung carcinogenesis.

We recognize that the above cross-validation procedure is purely data
analytic. Adapting the methods in Efron and Morris (1973) to the current
problem, we could replace our assumption that $V(\delta_i) = \sigma^2$ (for
all $i$) with a more general specification. That is, we might assume
$V(\delta_i) = \tau^2$ for some subset of experiments and then employ a joint
prior distribution for $(\sigma, \tau)$ to derive the posterior distribution
of $\theta$. Unless the "suspicious subset" can be identified a priori, our
present method appears to be much simpler in practice.

7.   PERFECTLY REPLICATED, IMPERFECTLY REPLICATED,
AND STRONGLY RELATED EXPERIMENTS.

7.1  Definitions. 

In some situations, we may have additional prior information on the
relationships between experiments. In Table 5, for example, it is not
unreasonable to posit that diesel engine emissions I through IV ought to be
more related to each other than to the remaining environmental agents. An
analogous assumption might apply if experimental data were available in two
closely related species, or two strains of the same species, or even males and
females of the same species.

We  now  investigate  how  one  might  take  advantage  of  this  form  of  prior 
information.  For  this  purpose,  we  need  to  characterize  precisely  how 
experiments  can  be  related  a  priori  in  terms  of  our  statistical  model.  A 
rather  different  classification  of  degrees  of  relatedness  is  given  by  Smith 
(1973c) . 

Consider the results $y_i$ of a particular experiment. Its mean $\theta_i$
can be separated into two components, $x_i\beta$ and $\delta_i$, with
$\theta_i = x_i\beta + \delta_i$. We shall call two experiments $i$ and $i'$
"perfectly replicated" if $x_i\beta = x_{i'}\beta$ and
$\delta_i = \delta_{i'}$ a priori. Under this definition, the only source of
variation in $(y_i - y_{i'})$ is the sampling error associated with each
experimental observation.

We shall call two experiments $i$ and $i'$ "imperfectly replicated" if
$x_i\beta = x_{i'}\beta$ a priori, but, conditional on $\sigma$, the
hyperparameters $\delta_i$ and $\delta_{i'}$ are a priori independent
$N(0, \sigma^2)$. If $x_i\beta$ is highly correlated a priori with
$x_{i'}\beta$, we shall say that the corresponding experiments are "strongly
related." The term "highly correlated" is temporarily left vague. Finally,
all pairs of experiments that are neither perfectly replicated, imperfectly
replicated, nor strongly related will be defined as "weakly related."

The case of perfectly replicated experiments presents no special problems
for the present paper. If $y_i \pm c_i$ and $y_{i'} \pm c_{i'}$ are the
sufficient statistics for the two experiments, we merely replace them by the
single statistic $y_{i''} \pm c_{i''}$, where

$$y_{i''} = \frac{c_{i'}^2\, y_i + c_i^2\, y_{i'}}{c_i^2 + c_{i'}^2},
\qquad
c_{i''} = \left(c_i^{-2} + c_{i'}^{-2}\right)^{-1/2}.$$


We  might  apply  this  procedure  if  two  different  experiments  were  independently 
performed  in  the  same  species  with  the  same  agent  but,  say,  in  different 
laboratories.  This  case  will  not  be  considered  further. 
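
The pooling rule for perfect replicates is ordinary inverse-variance weighting. A minimal sketch follows; the function name and the input values are illustrative and are not taken from Table 5.

```python
def pool_perfect_replicates(y1, c1, y2, c2):
    """Combine two perfectly replicated results y1 +/- c1 and y2 +/- c2.

    Each observation is weighted by its precision 1/c^2, which is
    algebraically the same as (c2^2*y1 + c1^2*y2)/(c1^2 + c2^2).
    """
    w1, w2 = 1.0 / c1**2, 1.0 / c2**2
    y = (w1 * y1 + w2 * y2) / (w1 + w2)
    c = (w1 + w2) ** -0.5
    return y, c


# Two equally precise (illustrative) log slopes: the pooled value is their
# average, and the pooled standard error shrinks by a factor of sqrt(2).
y_pooled, c_pooled = pool_perfect_replicates(1.0, 1.0, 3.0, 1.0)
```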

When two experiments are imperfectly replicated, we admit the possibility
of independent deviations from the underlying regression model, as well as
independent sampling errors. When the hyperparameter $\beta$ has prior
distribution $N(b, V)$, imperfect replication implies that (conditional on
$\sigma$) the slopes $\theta_i$ and $\theta_{i'}$ have a priori identical
means $xb$, identical variances $xVx' + \sigma^2$, and correlation coefficient


(7.1)     $xVx'/(xVx' + \sigma^2)$,


where $x_i = x_{i'} = x$ are row vectors.

When a diffuse prior is assumed for $\beta$, we must apply the definition
of imperfect replication with care. If $V$ is replaced by $tV$ and
$t \to \infty$, the a priori correlation coefficient (7.1) approaches unity.
Since $\theta_i$ and $\theta_{i'}$ are normally distributed with identical
means and variances, a correlation coefficient of 1 would imply that
$\theta_i = \theta_{i'}$ a priori. But this would imply that
$\theta_i = \theta_{i'}$ a posteriori, even if
$(y_i - y_{i'})/(c_i^2 + c_{i'}^2)^{1/2}$ is large. The assumption of a
diffuse prior on $\beta$ apparently reduces the notion of imperfect
replication to that of perfect replication, even though $\delta_i$ and
$\delta_{i'}$ are assumed to differ on the order of $\sigma$.
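
The limiting behavior of (7.1) is easy to check numerically. In this sketch the scalar `xVx` stands for $xVx'$ and `t` is the inflation factor applied to $V$; the numerical values are illustrative only.

```python
def replicate_correlation(xVx, sigma, t=1.0):
    """Prior correlation (7.1) between two imperfect replicates,
    with V replaced by t*V; xVx is the scalar x V x'."""
    return t * xVx / (t * xVx + sigma**2)


# With x V x' = 1 and sigma = 1, the correlation climbs toward 1 as t grows.
corrs = [replicate_correlation(1.0, 1.0, t) for t in (1, 10, 100, 1e6)]
```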

This difficulty appears to be related to the class of marginalization
paradoxes discussed by Dawid, Stone, and Zidek (1973), which are known to
occur sometimes when improper prior distributions are employed. We note from
formula (3.14) in Section 3.3, however, that when $\beta$ takes on a diffuse
prior, $E[\theta \mid Y, \sigma] = (I + CR)^{-1} y$, where
$R = \sigma^{-2}(I - X(X'X)^{-1}X')$. It is straightforward to show that for
this formula, $E[\theta_i - \theta_{i'} \mid Y, \sigma]$ is zero only when
$y_i = y_{i'}$, even if $x_i = x_{i'}$. The paradox that $\theta_i$ and
$\theta_{i'}$ are perfectly correlated a priori is avoided so long as we use
(3.14) to evaluate the posterior means and variances of $\theta$.
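
A two-experiment special case makes this concrete. The sketch below assumes that $C$ in (3.14) is the diagonal matrix of sampling variances $c_i^2$ (Section 3.3 is not reproduced here, so this is an assumption) and takes $x_1 = x_2 = 1$, so that $I - X(X'X)^{-1}X'$ has entries $\pm 1/2$. The function name and inputs are illustrative.

```python
def posterior_means_two_replicates(y1, c1, y2, c2, sigma):
    """E[theta | Y, sigma] = (I + C R)^{-1} y for two experiments with
    x_1 = x_2 = 1, assuming C = diag(c1^2, c2^2)."""
    a1 = c1**2 / (2.0 * sigma**2)
    a2 = c2**2 / (2.0 * sigma**2)
    # I + C R = [[1 + a1, -a1], [-a2, 1 + a2]]; its determinant is 1 + a1 + a2.
    det = 1.0 + a1 + a2
    t1 = ((1.0 + a2) * y1 + a1 * y2) / det
    t2 = (a2 * y1 + (1.0 + a1) * y2) / det
    return t1, t2


# Unequal observations stay unequal after shrinkage: t1 - t2 = (y1 - y2)/det.
theta1, theta2 = posterior_means_two_replicates(1.0, 1.0, 3.0, 1.0, 1.0)
```

As $\sigma$ shrinks toward zero the two posterior means converge to the precision-weighted average of $y_1$ and $y_2$, and as $\sigma$ grows they revert to the raw observations.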

The case where two experiments are strongly related is even more general.
The correlation between $x_i\beta$ and $x_{i'}\beta$ is a priori

(7.2)     $r = x_i V x_{i'}' \big/ \left[(x_i V x_i')(x_{i'} V x_{i'}')\right]^{1/2}$,


while (conditional on $\sigma$) the correlation between $\theta_i$ and
$\theta_{i'}$ is a priori

(7.3)     $x_i V x_{i'}' \big/ \left[(x_i V x_i' + \sigma^2)(x_{i'} V x_{i'}' + \sigma^2)\right]^{1/2}$,

which reduces to (7.1) when $x_i = x_{i'}$ (i.e., imperfect replication). Note
that the correlation between $\theta_i$ and $\theta_{i'}$ in (7.3) is always
less than $r$ in absolute value.
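
The comparison between (7.3) and (7.2) follows in one line, since the two expressions share a numerator and each factor of the denominator in (7.3) is enlarged by $\sigma^2$. Strict inequality assumes $\sigma > 0$ and a nonzero numerator:

```latex
\left|\operatorname{corr}(\theta_i,\theta_{i'}\mid\sigma)\right|
  = \frac{\lvert x_i V x_{i'}'\rvert}
         {\sqrt{(x_i V x_i' + \sigma^2)\,(x_{i'} V x_{i'}' + \sigma^2)}}
  < \frac{\lvert x_i V x_{i'}'\rvert}
         {\sqrt{(x_i V x_i')\,(x_{i'} V x_{i'}')}}
  = \lvert r\rvert ,
\qquad \sigma > 0,\; x_i V x_{i'}' \neq 0 .
```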

We do not wish to draw a sharp boundary between the terms strongly
related and weakly related. If the correlation $r$ in (7.2) exceeds 0.9 in
absolute value, we would certainly call the corresponding experiments strongly
related. If $|r| < 0.7$, or $r^2 < 0.5$ (i.e., the "between $x_i$" variance
falls below the "within $x_i$" variance), then we would use the term weakly
related. When $x_i \neq x_{i'}$ a priori and a diffuse prior is assumed for
$\beta$, i.e., the assumptions used in Sections 4, 5, and 6, we regard the
corresponding experiments as weakly related.

7.2  Informative Priors on $\beta$ For the 3x8 Case.
Figure 8 diagrams a 3x8 array of experiments similar to that derived from
the diagnostic procedure in Section 6. As a result of our elimination of the
skin tumor data, the remaining single experiment on benzo(a)pyrene could not
affect the posterior distributions of the other slopes. Hence, it was
removed. In its place, however, we have added the results of an
epidemiological study of men exposed in their occupations to a fifth type of
diesel engine, a heavy duty diesel different from the other diesels. This
additional slope was taken from Harris's (1981) analysis of lung cancer
incidence among London Transport Authority diesel bus workers. (The slope
estimate was originally reported in units of incremental relative risk per
μg/m³ particulates x years. It was converted to units of incremental relative
risk per 10 μg/m³ extractable organics x years under the assumption that the
dichloromethane extractable fraction constitutes 18 percent of particulates by
weight.) No bioassay studies of the latter type of diesel engine emission
were available.

We wish to introduce our prior knowledge that the experiments in diesel
columns I through V are more related to each other than to the remaining
experiments. Just what degree of interrelatedness should be assumed is not
clear. It is implausible that the diesel experiments in each row should be
perfectly replicated. To assume that they are imperfectly replicated
requires, in effect, that the diesel emissions are virtually identical in
composition, but that different diesel engine samples, through variations in
engine design or operating conditions, may result in idiosyncratic effects in
some species. If we were interested primarily in the possible risks of
exposure to light duty diesel emissions, the assumption that all light duty
and heavy duty diesel experiments were imperfect replicates could be too
restrictive. The assumption of strongly related experiments may therefore be
more useful.

Of course we could fall back on the assumptions of unequal x's and a
diffuse prior on $\beta$. In that case, however, there is no point in
including the observed slope for diesel V. Its value of $y$ would always be
perfectly fit to the underlying regression model by the additional column
effect that must be estimated, and it would not contribute toward estimation
of the other $\theta$'s. Only a proper prior on $\beta$ would reflect our
belief that the data on the human occupational exposure to diesel engine V are
relevant to the estimation of lung cancer risks from exposure to the other
diesel engines.

We now examine in detail the choice of a proper prior for $\beta$. Return
to equation (3.2) in Section 3.1, that is,


$$\theta_{kl} = \mu + \alpha_k + \gamma_l + \delta_{kl},$$


where $k = 1, 2, 3$ correspond to the three species and $l = 1, \dots, 8$
correspond to the eight environmental agents in Figure 8. The hyperparameters
$\{\gamma_3, \dots, \gamma_7\}$ refer specifically to the various diesel
effects. A natural model for the relationship between these hyperparameters
is


(7.4)     $\gamma_l = \gamma_0 + \eta_l, \quad l = 3, \dots, 7,$


where $\gamma_0$ is a component common to all diesel engines and
$\{\eta_3, \dots, \eta_7\}$ represent deviations of each diesel from the
common component. Each $\eta_l$ has prior mean zero and is independent of the
other $\eta_{l'}$ and of $\gamma_0$.

Now denote the prior variance of $\gamma_0$ by $v_0$ and the prior
variance of $\eta_l$ by $v_\eta$. If $v_0 = 0$ a priori, then the
hyperparameters $\{\gamma_l\}$ are uncorrelated. The quantities
$\{\mu + \alpha_k + \gamma_l\}$, i.e., the quantities $\{x_i\beta\}$, will be
correlated only through the common hyperparameters $\{\mu, \alpha_k\}$.
Provided that $v_\eta$ is not small compared to the prior variances of
$\{\mu + \alpha_k\}$, the diesel columns are weakly related.

On the other hand, if $v_\eta = 0$ a priori, then the hyperparameters
$\{\gamma_l\}$ are perfectly correlated, that is, the diesel columns are
imperfect replicates. Finally, if $v_0$ is large and $v_\eta > 0$ is small,
then the diesel columns are strongly related.

Since there are K+L+1 = 3+8+1 = 12 components in the hyperparameter set
$\{\mu, \alpha_k, \gamma_l\}$, we must specify a 12x1 vector $b$ of prior
means as well as a 12x12 prior covariance matrix $V$. This covariance matrix
contains a 5x5 submatrix of covariances among the diesel column effects
$\{\gamma_3, \dots, \gamma_7\}$. We now make the following numerical
assumptions.

(i) For every component of the prior mean of $\{\mu, \alpha_k, \gamma_l\}$,
we choose $b = 0$.

(ii) For all $(j, j')$ that are not part of the 5x5 submatrix of diesel
column effects, we choose $V_{jj'} = 100\, I_{jj'}$.

(iii) To exemplify weakly related experiments, we choose $v_0 = 0$ and
$v_\eta = 100$. In this case, $V = 100I$, where $I$ is the 12x12 identity
matrix, and for any pair of diesel experiments in the same species, the
correlation $r$, defined in (7.2), is 0.67.

(iv) To exemplify imperfectly replicated experiments, we replace (iii)
with the assumption that $v_0 = 100$ and $v_\eta = 0$. In this case, the
number of column effects L is reduced to 4, and the hyperparameter set
$\{\mu, \alpha_k, \gamma_l\}$ could be reduced to only 8 components. The
computations are equivalent to a 3x4 analysis, where the design matrix $X$ has
several identical rows for each diesel engine, and $V = 100I$, where $I$ is
the 8x8 identity matrix.

(v) To exemplify strongly related experiments, we replace (ii) and (iii)
with the assumption that $v_0 = 99$ and $v_\eta = 1$. For any pair of diesel
experiments in the same species, the correlation $r$, defined in (7.2), is
0.997. The prior covariance matrix $V$ takes the form


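$$
V = \begin{pmatrix} 100\, I_6 & 0 & 0 \\ 0 & D & 0 \\ 0 & 0 & 100 \end{pmatrix},
\qquad
D = \begin{pmatrix}
100 & 99 & \cdots & 99 \\
99 & 100 & \cdots & 99 \\
\vdots & \vdots & \ddots & \vdots \\
99 & 99 & \cdots & 100
\end{pmatrix} \quad (5 \times 5),
$$

where the upper left submatrix $100\, I_6$ is the covariance of
$\{\mu, \alpha_k, \gamma_1, \gamma_2\}$, the submatrix $D$ in the center is
the covariance of $\{\gamma_3, \dots, \gamma_7\}$, and the lower right element
is the variance of $\gamma_8$.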
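
The correlations quoted in (iii) and (v) can be reproduced directly from (7.2). The sketch below builds the 12x12 prior covariance from the two diesel variance components and evaluates $r$ for two diesel experiments in the same species. The ordering of hyperparameters (overall mean, three species effects, eight agent effects) and all function names are assumptions made for illustration.

```python
def diesel_prior_cov(v0, v_eta, base=100.0):
    """12x12 prior covariance, ordered (mu, a1, a2, a3, g1, ..., g8).

    Diesel column effects g3..g7 (indices 6..10) share the common-component
    variance v0 and each carries its own deviation variance v_eta.
    """
    n, diesel = 12, range(6, 11)
    V = [[base if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i in diesel:
        for j in diesel:
            V[i][j] = v0 + (v_eta if i == j else 0.0)
    return V


def design_row(k, l):
    """Indicator row x_i for species k (1..3) and agent l (1..8)."""
    x = [0.0] * 12
    x[0] = x[k] = x[3 + l] = 1.0
    return x


def quad(x, V, z):
    """Bilinear form x V z'."""
    return sum(x[i] * V[i][j] * z[j] for i in range(12) for j in range(12))


def prior_r(V, xi, xj):
    """Correlation (7.2) between x_i beta and x_j beta."""
    return quad(xi, V, xj) / (quad(xi, V, xi) * quad(xj, V, xj)) ** 0.5


x_a, x_b = design_row(1, 3), design_row(1, 4)   # two diesels, same species
r_weak = prior_r(diesel_prior_cov(0.0, 100.0), x_a, x_b)      # 200/300
r_imperfect = prior_r(diesel_prior_cov(100.0, 0.0), x_a, x_b)  # 300/300
r_strong = prior_r(diesel_prior_cov(99.0, 1.0), x_a, x_b)      # 299/300
```

The weak case gives 200/300, because the two experiments share only the overall mean and the species effect; the strong case adds the common diesel covariance of 99, giving 299/300.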

We chose a value of 100 for the prior variance of the hyperparameters
$\{\mu, \alpha_k, \gamma_l\}$ to reflect our near complete state of ignorance
about these effects. (It is possible, as Smith (1973a) shows in a somewhat
simpler situation, to let these variances go to infinity and still retain our
concept of strongly related experiments for fixed $v_\eta$. Our choice of a
large finite value for $v_0$ is more convenient here.) Under this prior
assumption, every log slope $\theta_i$ has variance $300 + \sigma^2$, and
therefore we are uncertain about the magnitude of each slope to a
multiplicative factor of about $\exp(20) \approx 5 \times 10^{8}$. Since these
variances are so large, the somewhat arbitrary choice of $b = 0$ does not
materially affect the calculations, since all values of $y_i$ are within
$\pm 5$ of $Xb = 0$. Our choice of $v_\eta = 1$ in (v) reflects our sense of
the likely magnitude of the difference between any two diesel log slopes in
the same species (e.g., $\theta_{kl} - \theta_{kl'}$). The a priori standard
deviation of this difference is $\sqrt{2(v_\eta + \sigma^2)}$ conditionally on
$\sigma$. Even if $\sigma = 0$, our choice of $v_\eta = 1$ means the ratio of
the two slopes is $\exp(z\sqrt{2})$, where $z$ is $N(0,1)$. (Note that it is
essential to choose equal prior means for $\gamma_3, \dots, \gamma_7$ for this
to be true.) Since $\exp(z\sqrt{2}) = 10$ for $z = 1.6$, our prior choice of
$v_\eta = 1$ allows for more than a 10 percent chance that one of the two
diesels is 10 times as potent as the other. No greater uncertainty seems
justified. At the other extreme, to assume that $v_\eta \ll \sigma^2$ would be
nearly equivalent to imperfect replication.
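
The arithmetic behind the choice of a unit variance for the diesel deviations can be checked with the standard normal tail. This sketch takes $\sigma = 0$, as in the text, so the log ratio of two diesel slopes is $z\sqrt{2}$ with $z$ standard normal; the variable names are illustrative.

```python
import math

# z value at which exp(z * sqrt(2)) reaches a tenfold potency ratio.
z_needed = math.log(10.0) / math.sqrt(2.0)

# Two-sided tail: either diesel may be the more potent one.
# P(|z| > a) = erfc(a / sqrt(2)) for a standard normal z.
p_tenfold = math.erfc(z_needed / math.sqrt(2.0))

# Multiplicative uncertainty quoted for a log slope with prior variance ~300.
factor = math.exp(20.0)
```

Here `z_needed` is about 1.63 and `p_tenfold` is slightly above 0.10, consistent with the statement that the prior allows more than a 10 percent chance of a tenfold difference.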

Table 7 shows the resulting estimates of $\sigma$ and of the $\theta$'s for
roofing tar, coke oven, diesel I and diesel V emissions. Since the assumption
that the diesel columns are weakly related approximates complete ignorance
about $\beta$, the results in that column are quite close to those obtained in
the diffuse prior case in Table 6 (see the column labelled 3x8). The log slope
for the diesel V epidemiological study contributes to the estimation of
$\sigma$ and to the remaining $\theta$'s only through the vague prior on
$\beta$. Its own value of $y$ is nearly perfectly fit to the underlying model.

The assumption of imperfect replication, however, results in a dramatic
increase in the estimate of $\sigma$. Any variation in the diesel $\theta$'s
that is in fact due to the $\eta$'s is forced to be fitted to the
corresponding $\delta$'s. Hence, the posterior variance of $\delta$ increases.
Although this prior assumption reduces the posterior standard deviation for
Diesel V, the precision of the other estimates deteriorates. The assumption of
imperfect replication, we conclude, is inappropriate in the present
illustration.

TABLE 7.

Bayes Estimates of Log Slopes For Lung Cancer Risk in Man
Alternative Assumptions on the Relation Between Diesel Columns
3x8 Data Matrix

                             Weakly     Imperfect    Strongly    Original
                             Related    Replicates   Related     Data
(v_0, v_η)                   (0,100)    (100,0)      (99,1)

σ     [illegible]            0.388      1.100        0.416
      σ*                     0.478      1.221        0.515

Roofing Tar        θ*        1.533      1.393        1.724       0.495
                   c*        0.739      1.124        0.743       1.415

Coke Oven          θ*        1.424      1.504        1.495       1.482
                   c*        0.333      0.336        0.330       0.341

Diesel Engine I    θ*        0.339      -0.401       0.442
                   c*        0.862      1.594        0.868

Diesel Engine V    θ*        1.877      0.443        0.241       1.921
                   c*        1.497      1.165        1.029       1.512

The assumption of strongly related experiments does not have this
limitation. Our conservative use of prior information about the relatedness
of the diesel experiments does not increase $\sigma^*$ much beyond 0.5. The
posterior standard deviations of the roofing tar, coke oven and diesel I
emissions are increased only slightly in comparison to the first column of the
table. But the precision of the estimate for diesel V is improved. The
incorporation of epidemiological data on occupational exposures to diesel
engine V has led us to revise upward our estimate of the slope for diesel I.
Moreover, the results of the viral transformation and mutagenesis experiments
on diesels I through IV have led us to revise downward our estimate of the
potency of diesel V. Rather than being more potent than roofing tar or coke
oven emissions, as originally suggested by the data, diesel V is likely to be
3 or 4 times less potent, although not conclusively so.
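
The relative potency statement can be checked from the "Strongly Related" column of Table 7. Because the entries are log slopes, a potency ratio is the exponential of a difference; the sketch assumes natural logarithms, as the use of exp elsewhere in this section suggests.

```python
import math

# Posterior mean log slopes under the strongly related prior (Table 7).
theta_star = {"roofing tar": 1.724, "coke oven": 1.495, "diesel V": 0.241}

# Potency of each comparison agent relative to diesel V.
ratios = {
    agent: math.exp(value - theta_star["diesel V"])
    for agent, value in theta_star.items()
    if agent != "diesel V"
}
```

The ratios come out near 4.4 for roofing tar and 3.5 for coke oven emissions, in line with the "3 or 4 times" figure; the posterior standard deviation of about 1.0 for diesel V is why the comparison is not conclusive.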

8.  DISCUSSION AND CONCLUSIONS

We have constructed a general framework for combining the results of
diverse experiments when there is uncertainty about the relevance of some
experiments to others. Within this framework, we have attacked the specific
problem of assessing human cancer risks from heterogeneous toxicological and
epidemiological data.

We distinguish between the conventional sampling error inherent in each
experiment and a novel error of imperfect relevance among experiments. The
latter type of error formalizes our notion of the credibility of interspecies
and interagent extrapolations. We show how the available experimental data,
in combination with the scientist's prior information on the credibility of
such extrapolations, can be used to estimate the effects of various
environmental agents in man and other species.

For a relatively simple example involving two species and two
environmental agents, we show how the scientist's prior information can
override the data in predicting human cancer risks. When we add more
experiments on a third agent and in a third species, the data are predominant.
At the same time, a scientist's vague prior notion of the magnitude of these
extrapolative errors is made more precise.

We  then  propose  a  data  analytic  method  for  selecting  the  most  relevant 
subset  among  a  multitude  of  experiments.  The  main  idea  behind  this  method  is 
to  determine  which  species  or  which  environmental  agent  contributes  most  to 
our  estimate  of  extrapolative  error.  Such  species  or  agents  are  successively 
eliminated  from  the  data  base  so  long  as  the  precision  of  a  particular 
estimate,  say,  a  particular  human  cancer  risk,  is  improved. 

We apply this diagnostic method to a relatively large 5x9 array,
containing 36 observed dose-response slopes. This example demonstrates the
tradeoff between prediction bias due to potentially irrelevant experiments and
prediction efficiency resulting from combining diverse experiments. The
analysis tentatively suggests that for a particular class of environmental
emissions containing polyaromatic hydrocarbons, the results of mammalian cell
transformation experiments and mammalian mutagenesis experiments with
metabolic activator are more relevant to human lung cancer risks than
mammalian skin tumor initiation experiments or tests of indirect mutagenicity.
Although this finding may not ultimately withstand scrutiny, it was derived,
we stress, from our adoption of an attitude of exploratory analysis.

Finally,  we  demonstrate  how  prior  information  on  the  relationship  between 
experiments  can  be  incorporated  into  the  analysis.  This  situation  is  likely 
to  occur  when  experiments  have  been  performed  in  the  same  species  or  in 
different  strains  of  the  same  species,  or  when  tests  have  been  performed  on 
multiple samples of the same environmental mixture.

The main limitation of the present analysis, we feel, is our inability to
verify that the assumption of exchangeable errors of interspecies
extrapolation applies to humans. The reader could legitimately object that we
have merely assessed how well one can extrapolate from mouse skin to hamster
embryo cells to mouse lymphoma cells. To confirm that the error of
extrapolation is exchangeable among species, we need precise human
carcinogenesis data.

Our analysis in Section 6 revealed that cigarette smoke is a more potent
direct mutagen and a more potent transforming agent in cell culture than would
be expected from its observed carcinogenic potency in man. When we excluded
experiments involving cigarette smoke, the estimate of $\sigma$ for the
remaining data declined. We do not regard this finding as strong evidence
against exchangeability of extrapolation errors across all species. However,
the conclusion that the remaining data fit the underlying model more exactly,
we acknowledge, is based primarily on the more precise non-human experimental
data.

We should note, however, that the assumption of spherical errors
(equation (3.3c)) is only a special case. The covariance matrix $\sigma^2 I$
for the extrapolative errors $\delta$ could be replaced by a more general
matrix. For example, the deviations $\delta$ corresponding to experiments in a
particular species or agent could have a variance that is different a priori
from $\sigma^2$. We have not explored these possibilities in this initial
paper because the naive exchangeability assumption seems to be a reasonable
starting point.

Moreover, we do not regard the basic normal data, normal prior structure
of our model as particularly objectionable. Deviations from the underlying
constant relative potency model may arise from biological processes that are
non-Gaussian. Since the normal distribution has smaller tails than other
likely candidates, any outliers from the underlying normal model will have a
stronger contribution to the overall estimate of the hyperparameter $\sigma$.
Our use of the normal model is thus more conservative in this respect. In any
case, it gives analytical formulas that permit others to reproduce our
results.

Nor  do  we  take  the  underlying  constant  relative  potency  model  to  be  an 
important  limitation.  Our  framework  could  easily  accommodate  a  constant 
additive  potency  model  or,  for  that  matter,  any  regression  model  of  the  form 
θ  =  Xβ + δ,  where  X  is  a  known  design  matrix.  The  appeal  of  the  constant 
relative  potency  concept  is  its  avoidance  of  potentially  complex  or 
implausible  conversions  of  dosage  units  between  species. 
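
To illustrate how a general regression structure can be handled, the following minimal sketch computes estimates for a model in which the true potencies θ equal Xβ plus extrapolative errors δ. It rests on assumptions that go beyond the text: the sampling variances c of the observed values y are treated as known, the prior on β is flat, and the extrapolative standard deviation σ is held fixed rather than estimated. The function name, the design matrix, and the numerical values are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def posterior_given_sigma(y, c, X, sigma):
    """Conditional estimates for y = theta + eps, theta = X beta + delta.

    Assumptions (illustrative only):
      eps   ~ N(0, diag(c)), with the sampling variances c known
      delta ~ N(0, sigma^2 I), with sigma held fixed
      beta has a flat prior
    Returns the generalized least squares estimate of beta and the
    conditional posterior mean of theta.
    """
    y = np.asarray(y, dtype=float)
    c = np.asarray(c, dtype=float)
    X = np.asarray(X, dtype=float)

    # Marginally, y ~ N(X beta, diag(c) + sigma^2 I); w holds the
    # diagonal of the inverse of that covariance matrix.
    w = 1.0 / (c + sigma**2)
    XtW = X.T * w
    beta_hat = np.linalg.solve(XtW @ X, XtW @ y)

    # Each observation is shrunk toward its fitted value; the weight on
    # the observed residual is sigma^2 / (c + sigma^2).
    resid = y - X @ beta_hat
    theta_star = X @ beta_hat + (sigma**2 * w) * resid
    return beta_hat, theta_star

# Hypothetical 2 x 2 layout (two species, two agents) with an intercept,
# a species indicator, and an agent indicator as columns of X.
X = np.array([[1, 0, 0],
              [1, 1, 0],
              [1, 0, 1],
              [1, 1, 1]], dtype=float)
y = np.array([1.0, 2.0, 1.5, 2.9])   # hypothetical observed values
c = np.array([0.1, 0.2, 0.1, 0.4])   # hypothetical sampling variances

beta_hat, theta_star = posterior_given_sigma(y, c, X, sigma=0.5)
```

When σ is zero, the estimates of θ coincide with the fitted values Xβ, and as σ grows large they approach the observed values y. Changing the columns of X changes the extrapolative model without altering the rest of the calculation.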

Nor  do  we  attach  any  special  limitation  to  our  apparent  reliance  on  the 
slope  of  a  linear  dose-response  relationship.  Although  we  recognize  that 
there  is  considerable  support  for  such  a  dose-response  model,  it  should  be 
clear  that  alternative  methods  of  summarizing  the  results  of  an  experiment  are 
possible.  For  example,  the  TD50  (dose  at  which  50  percent  of  subjects  develop 
clinical  toxicity)  or  the  MTD  (maximum  tolerated  dose)  could  be  used  for  each 
human  experiment,  as  in  Freireich  et  al.  (1966).  For  the  non-human  species,  at 
least,  the  LD50  could  be  used,  as  in  Meselson  and  Russell  (1977).  In  fact, 
our  model  could  be  generalized  to  the  multivariate  case  where  each  experiment 
is  summarized  by  a  vector  of  numbers.  In  this  way,  one  could  incorporate 
such  additional  factors  as  the  duration  and  fraction  of 
exposure,  or  possible  synergistic  effects  with  other  tumor  initiators  or 
promoters. 

The  model  described  in  this  paper  appears  to  provide  the  theoretical 
underpinning  for  previous  attempts  to  combine  carcinogenesis 
experiments.  Meselson  and  Russell's  (1977)  comparison  of  mutagenic  potency  in 
Salmonella  with  carcinogenic  potency  in  rodents  constitutes  a  special  2xl  case 
in  our  framework.  In  fact,  the  present  methodology  can  be  used  to  resolve  the 
criticism  that  the  favorable  results  of  these  authors  were  merely  fortuitous. 

Similarly,  our  approach  resolves  the  difficulties  encountered  by  Crouch 
and  Wilson  (1979,  1980)  in  having  to  perform  separate  comparisons  of 
carcinogenic  potency  in  different  pairs  of  species.  It  also  satisfies  these 
authors'  desire  for  a  systematic  approach  to  the  identification  of  potential 
exceptions  to  the  underlying  extrapolative  model.  Moreover,  the  use  of 
informative  priors  on  the  hyperparameters  of  the  model,  as  illustrated  in 
Section  7,  permits  us  to  include  multiple  experiments  on  the  same  agent  in  the 
same  species.  We  therefore  avoid  the  problem,  encountered  by  these  authors, 
of  deciding  which  of  several  experiments  to  incorporate  in  the  analysis.  By 
the  use  of  informative  priors,  we  could  also  incorporate  information  about  the 
faulty  design  or  execution  of  an  experiment.  Furthermore,  in  a  multivariate 
generalization  of  our  model,  we  could  incorporate  the  incidences  of  tumors  of 
different  sites.  This  would  avoid  the  additional  difficulty,  encountered  by 
these  authors,  of  deciding  which  of  several  endpoints  to  choose. 

Finally,  this  paper  considers  only  the  estimation  of  carcinogenic 
potency.  We  do  not  discuss  the  use  of  these  estimates,  in  combination  with 
data  on  the  extent  of  exposure  to  an  environmental  agent,  to  predict  possible 
excess  cancer  incidence.  Such  an  application  entails  additional,  important 
uncertainties  that  are  beyond  the  scope  of  this  study. 